Abstract

We  present  a  theory  of  excess  stock  market  volatility,  in  which  market  movements  are  due  to 
trades  by  very  large  institutional  investors  in  relatively  illiquid  markets.  Such  trades  generate 
significant spikes in returns and volume, even in the absence of important news about fundamentals.
We derive the optimal trading behavior of these investors, which allows us to provide a
unified  explanation  for  apparently  disconnected  empirical  regularities  in  returns,  trading  volume 
and  investor  size. 

I.      Introduction 

Ever  since  Shiller  [1981],  economists  have  sought  to  understand  the  origins  of  volatility  in  stock 
market prices, which appears to exceed the predictions of simple models with rational expectations
and constant discounting.1 Even after the fact, it is hard to explain changes in the stock market
using  only  observable  news  [Cutler,  Poterba  and  Summers  1989;  Fair  2002;  Roll  1988]. 

We  present  a  model  in  which  volatility  is  caused  by  the  trades  of  large  institutions.  Institutional 
investors  appear  to  be  important  for  the  low-frequency  movements  of  equity  prices,  as  shown  by 
Gompers  and  Metrick  [2001].  Understanding  better  the  behavior  of  institutional  investors  also 
sheds  light  on  many  issues,  such  as  momentum  and  positive  feedback  trading  [Chae  and  Lewellen 
2004;  Cohen,  Gompers  and  Vuolteenaho  2002;  Choe,  Kho  and  Stulz  1999;  Hvidkjaer  2005],  bubbles 
[Brunnermeier  and  Nagel  2004],  liquidity  provision  [Campbell,  Ramadorai  and  Vuolteenaho  2005], 
and the importance of indexing [Goetzmann and Massa 2003]. We further this research by analyzing
how  trading  by  individual  large  investors  may  create  price  movements  that  are  hard  to  explain  by 
fundamental  news. 

In  our  theory,  spikes  in  trading  volume  and  returns  are  created  by  a  combination  of  news  and 
the  trades  by  large  investors.  Suppose  news  or  proprietary  analysis  induces  a  large  investor  to 
trade  a  particular  stock.  Since  his  desired  trading  volume  is  then  a  significant  proportion  of  daily 
turnover,  he  will  moderate  his  actual  trading  volume  to  avoid  paying  too  much  in  price  impact.2 
The  optimal  volume  will  nonetheless  remain  large  enough  to  induce  a  significant  price  change. 

Traditional  measures,  such  as  variances  and  correlations,  are  of  limited  use  in  analyzing  spikes 
in  market  activity.    Many  empirical  moments  are  infinite;  moreover,  their  theoretical  analysis  is 
typically intractable.3 Instead, a natural object of analysis turns out to be the tail exponent of the
distribution,  for  which  some  convenient  analytical  techniques  apply.    Furthermore,  there  is  much 
empirical  evidence  on  the  tails  of  the  distributions,  which  appear  to  be  well  approximated  by  power 
laws. For example, the distribution of returns r over daily or weekly horizons decays according to
\(P(|r| > x) \sim x^{-\zeta_r}\), where \(\zeta_r\) is the tail or Pareto exponent.4 This accumulated evidence on tail
behavior  is  useful  to  guide  and  constrain  any  theory  of  the  impact  of  large  investors.  Specifically, 
our theory unifies the following stylized facts:
(i) the power law distribution of returns, with exponent \(\zeta_r \simeq 3\);
(ii) the power law distribution of trading volume, with exponent \(\zeta_q \simeq 1.5\);
(iii) the power law distribution of price impact;
(iv) the power law distribution of the size of large investors, with exponent \(\zeta_S \simeq 1\).


1 See also Campbell and Shiller [1988], French and Roll [1986], LeRoy and Porter [1981].
2 See Section II.C.
3 The variance of volume and the kurtosis of returns are infinite. Section II provides more details.
4 Appendix 1 reviews the relevant techniques.


Existing  models  have  difficulty  in  explaining  facts  (i)-(iv)  together,  not  only  the  power  law 
behavior  in  general,  but  also  the  specific  exponents.  For  example,  efficient  markets  theories  rely 
on  news  to  move  stock  prices  and  thus  can  explain  the  empirical  finding  only  if  news  is  power  law 
distributed with an exponent \(\zeta_r \simeq 3\). However, there is nothing a priori in the efficient markets
hypothesis that justifies this assumption. Similarly, GARCH models generate power laws, but
replicating the exponent of 3 requires fine-tuning.5

We  rely  on  previous  research  to  explain  (iv),  and  develop  a  trading  model  to  explain  (iii).  We 
use  these  facts  together  to  derive  the  optimal  trading  behavior  of  large  institutions  in  relatively 
illiquid  markets.  The  fat-tailed  distribution  of  investor  sizes  generates  a  fat-tailed  distribution  of 
volumes and returns. When we derive the optimal trading behavior of large institutions, we are able
to  replicate  the  specific  values  for  the  power  law  exponents  found  in  stylized  facts  (i)  and  (ii).6 

In  addition  to  explaining  the  above  facts,  an  analysis  of  tail  behavior  may  have  a  number  of 
wider applications in option pricing,7 risk management, and the debate on the importance of large
returns  for  the  equity  premium  [Barro  2005;  Rietz  1988;  Routledge  and  Zin  2004;  Weitzman  2005]. 

Our  paper  draws  on  several  literatures.  The  behavioral  finance  literature  [Barberis  and  Thaler 
2003;  Hirshleifer  2001;  Shleifer  2000]  describes  mechanisms  by  which  large  returns  obtain  without 
significant  changes  in  fundamentals.  We  propose  that  these  extreme  returns  often  result  from  large 
idiosyncratic  trades  of  institutions.  The  microstructure  literature  [Biais,  Glosten  and  Spatt  2005; 
O'Hara  1995]  shows  that  order  flow  can  explain  a  large  fraction  of  exchange  rate  movements  [Evans 
and  Lyons  2002]  and  stock  price  movements,  including  the  covariance  between  stocks  [Hasbrouck 
and Seppi 2001]. Previous papers combine these behavioral, microstructure and asset pricing elements
to explain the impact of limited liquidity and demand pressures on asset prices [Acharya
and Pedersen 2005; Gompers and Metrick 2001; Pritsker 2005; Shleifer 1986; Wurgler and Zhuravskaya
2002]. We complement this research by focusing on tail behavior, partially in the hope
that  understanding  extreme  events  allows  us  to  understand  standard  market  behavior  better. 

This  article  is  also  part  of  a  broader  movement  utilizing  concepts  and  methods  from  physics 
to study economic issues, a literature sometimes referred to as "econophysics".8 Econophysics


5 Also, GARCH models are silent about the economic origins of the tails, and about trading volume.

6 This  includes  the  relative  fatness  documented  by  facts  (i),  (ii)  and  (iv)  (note  that  a  higher  exponent  means  a 
thinner  tail).  Since  large  traders  moderate  their  trading  volumes,  the  distribution  of  volumes  is  less  fat-tailed  than 
that  of  investor  sizes.  In  turn,  a  concave  price  impact  function  leads  to  return  distributions  being  less  fat-tailed  than 
volume  distributions. 

7  Our  theory  indicates  that  trading  volume  should  help  forecast  the  probability  of  large  returns.  Marsh  and  Wagner 
[2004]  provides  evidence  consistent  with  that  view. 

8 Antecedents are Simon [1955] and Mandelbrot [1963]. More recent research includes Bak, Chen, Scheinkman,


is  similar  in  spirit  to  behavioral  economics  in  that  it  postulates  simple  plausible  rules  of  agent 
behavior,  and  explores  their  implications.  However,  it  differs  by  putting  less  emphasis  on  the 
psychological  microfoundations,  and  more  on  the  results  of  the  interactions  among  agents. 

Section II presents stylized facts on the tail behavior of financial variables. Section III then contains
our baseline model, which connects the power laws to one another. Section IV discusses various extensions.
Section  V  concludes.  Appendix  1  is  a  primer  on  power  law  mathematics. 

II.     The  Empirical  Findings  That  Motivate  Our  Theory 

This  section  presents  the  empirical  facts  that  motivate  our  theory,  and  provides  a  self-contained 
tour  of  the  empirical  literature  on  power  laws. 

II.A. The Power Law Distribution of Price Fluctuations: \(\zeta_r \simeq 3\)

The tail distribution of returns has been analyzed in a series of studies that use an ever increasing
number of data points [Jansen and de Vries 1991; Lux 1996; Gopikrishnan, Plerou, Amaral, Meyer,
and Stanley 1999; Plerou, Gopikrishnan, Amaral, Meyer, and Stanley 1999]. Let \(r_t\) denote the
logarithmic return over a time interval \(\Delta t\). The distribution function of returns for the 1,000
largest U.S. stocks and several major international indices has been found to be:9

(1) \(P(|r| > x) \sim \frac{1}{x^{\zeta_r}}\) with \(\zeta_r \simeq 3\).

Here,  ~  denotes  asymptotic  equality  up  to  numerical  constants.10  This  relationship  holds  for 
positive  and  negative  returns  separately  and  is  best  illustrated  in  Figure  I.  It  plots  the  cumulative 
probability distribution of the population of normalized absolute returns, with \(\ln x\) on the horizontal
axis and \(\ln P(|r| > x)\) on the vertical axis. It shows that

(2) \(\ln P(|r| > x) = -\zeta_r \ln x + \text{constant}\)


and Woodford [1993], Bouchaud and Potters [2003], Gabaix [1999, 2005], Plerou, Gopikrishnan, Amaral, Meyer, and
Stanley [1999], Levy, Levy, and Solomon [2000], Lux and Sornette [2002], Mantegna and Stanley [1995, 2000]. See
also  Arthur,  LeBaron,  Holland,  Palmer,  and  Tayler  [1997],  Blume  and  Durlauf  [2005],  Brock  and  Hommes  [1998], 
Durlauf  [1993],  Jackson  and  Rogers  [2005]  for  work  in  a  related  vein. 

9 To compare quantities across different stocks, we normalize variables such as r and q by the second moments if
they exist, otherwise by the first moments. For instance, for a stock i, we consider the returns \(r'_{it} = (r_{it} - r_i)/\sigma_{r,i}\),
where \(r_i\) is the mean of the \(r_{it}\) and \(\sigma_{r,i}\) is their standard deviation. For volume, which has an infinite standard
deviation, we use the normalization \(q'_{it} = q_{it}/\bar{q}_i\), where \(q_{it}\) is the raw volume, and \(\bar{q}_i\) is the absolute deviation:
\(\bar{q}_i = \overline{|q_{it} - \overline{q_{it}}|}\).

10 Formally, \(f(x) \sim g(x)\) means \(f(x)/g(x)\) tends toward a positive constant (not necessarily 1) as \(x \to \infty\).


yields a good fit for \(|r|\) between 2 and 80 standard deviations. OLS estimation yields \(-\zeta_r = -3.1 \pm 0.1\),
i.e., (1). It is not automatic that this graph should be a straight line, or that the slope
should be -3: in a Gaussian world it would be a concave parabola. In the following, we shall refer
to Equation 1 as "the cubic law of returns".11
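
The log-log regression behind Equation 2 can be sketched on simulated data. This is a minimal illustration, not the authors' estimation code: the sample is synthetic, and the function name `loglog_tail_slope`, the 5 percent tail cutoff, and the sample size are all assumptions made for the example.

```python
import numpy as np

def loglog_tail_slope(sample, tail_fraction=0.05):
    """OLS slope of ln P(|r| > x) on ln x over the largest observations."""
    x = np.sort(np.abs(sample))[::-1]            # largest first
    n_tail = int(len(x) * tail_fraction)
    tail = x[:n_tail]
    # The k-th largest point has empirical survival probability k / n.
    survival = np.arange(1, n_tail + 1) / len(x)
    slope, _intercept = np.polyfit(np.log(tail), np.log(survival), 1)
    return slope

rng = np.random.default_rng(0)
n_obs = 200_000
# Hypothetical "returns": Pareto magnitudes with P(X > x) = x^-3, random signs.
magnitudes = rng.pareto(3.0, size=n_obs) + 1.0
power_law_returns = rng.choice([-1.0, 1.0], size=n_obs) * magnitudes
gaussian_returns = rng.standard_normal(n_obs)

slope_power_law = loglog_tail_slope(power_law_returns)
slope_gaussian = loglog_tail_slope(gaussian_returns)
print(f"power-law sample slope: {slope_power_law:.2f}")
print(f"Gaussian sample slope:  {slope_gaussian:.2f}")
```

On the synthetic power-law sample the fitted slope should be close to -3, whereas the Gaussian sample gives a much steeper slope because its log-log plot is curved rather than straight.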

Insert  Figure  I  here 
Insert  Figure  II  here 

Furthermore,  the  1929  and  1987  "crashes"  do  not  appear  to  be  outliers  to  the  power  law 
distribution  of  daily  returns  [Gabaix,  Gopikrishnan,  Plerou,  and  Stanley  2005].  Thus  there  may 
not  be  a  need  for  a  special  theory  of  "crashes":  extreme  realizations  are  fully  consistent  with  a 
fat-tailed  distribution.12 

Equation  1  appears  to  hold  internationally  [Gopikrishnan,  Plerou,  Amaral,  Meyer,  and  Stanley 
1999]. For example, Figure III shows that the distributions of returns for three different country
indices are very similar.13

Insert  Figure  III  here 

Having checked the robustness of the \(\zeta_r \simeq 3\) finding across different stock markets, Plerou,
Gopikrishnan, Amaral, Meyer, and Stanley [1999] examine firms of different sizes.14 Small firms
have higher volatility than large firms, as is verified in Figure IVa. Moreover, the same diagram
also shows similar slopes for the graphs of all four distributions.15 Figure IVb normalizes the
distribution  of  each  size  quantile  by  its  standard  deviation,  so  that  the  normalized  distributions  all 


11 The particular value \(\zeta_r \simeq 3\) is consistent with a finite variance, but moments higher than 3 are unbounded. \(\zeta_r \simeq 3\)
contradicts the "stable Paretian hypothesis" of Mandelbrot [1963], which proposes that financial returns follow a Lévy
stable distribution. A Lévy distribution has an exponent \(\zeta_r < 2\), which is inconsistent with the empirical evidence
[Fama 1963; McCulloch 1996; Rachev and Mittnik 2000].

12 Section IV.D reports quotes from the Brady report, which repeatedly marvels at how concentrated trading was
on Monday, October 19, 1987.

13 The empirical literature has proposed other distributions. We are more confident about our findings as they rely
on a much larger number of data points, and hence quantify the tails more reliably. We can also explain previous
findings in light of ours. Andersen, Bollerslev, Diebold, and Ebens [2001] show that the bulk of the distribution of
realized volatility is lognormal. In independent work, Liu, Gopikrishnan, Cizeau, Meyer, Peng, and Stanley [1999]
show that while this is true, the tails seem to be power law.

14 Some studies quantify the power law exponent of foreign exchange fluctuations. The most comprehensive is
probably Guillaume, Dacorogna, Dave, Müller, Olsen, and Pictet [1997], who calculate the exponent \(\zeta_r\) of the price
movements between the major currencies. At the shortest frequency \(\Delta t = 10\) minutes, they find exponents with
average \(\zeta_r = 3.44\), and a standard deviation 0.30. This is tantalizingly close to the stock market findings, though the
standard error is too high to draw sharp conclusions.

15 There is some dispersion in the measured exponent across individual stocks [Plerou, Gopikrishnan, Amaral,
Meyer, and Stanley 1999]. This is expected, at least because measured exponents are noisy. Proposition 5 makes
predictions about the determinants of a possible heterogeneity in the exponents.


have  a  standard  deviation  of  1.  The  plots  collapse  on  the  same  curve,  and  all  have  exponents  close 
to \(\zeta_r \simeq 3\).

Insert  Figure  IV  here 

The above results hold for relatively short time horizons, a day or less.16 Longer-horizon return
distributions are shaped by two opposite forces. One force is that a finite sum of independent power
law distributed variables with exponent \(\zeta\) is also power law distributed, with the same exponent \(\zeta\).17
Thus one expects the tails of monthly and even quarterly returns to remain power law distributed.
The second force is the central limit theorem, which says that if T returns are aggregated, the bulk
of the distribution converges to Gaussian. In sum, as we aggregate over T returns, the central part
becomes more Gaussian, while the tails remain a power law with exponent \(\zeta\), but have an ever
smaller probability, so that they may not even be detectable in practice. See Bouchaud and Potters
[2003, p. 33-35] for an example. In practice, the convergence to the Gaussian is slower than if
returns were independently and identically distributed (i.i.d.), and one still sees fat tails at yearly
horizons [Plerou, Gopikrishnan, Amaral, Meyer, and Stanley 1999, Figure 9]. A likely explanation
is the autocorrelation of volatility and trading activity [Plerou, Gopikrishnan, Amaral, Gabaix, and
Stanley 2000].18 A useful extension of the present model would allow the desire to trade (or signal
occurrences) to be autocorrelated, and might generate the right calibration of autocorrelation of
volatility and slow convergence to a Gaussian.

In  conclusion,  the  existing  literature  shows  that  while  high  frequencies  offer  the  best  statistical 
resolution  to  investigate  the  tails,  power  laws  still  appear  relevant  for  the  tails  of  returns  at  longer 
horizons,  such  as  a  month  or  even  a  year. 

II.B. The Power Law Distribution of Trading Volume: \(\zeta_q \simeq 3/2\)

To  better  constrain  a  theory  of  large  returns,  it  is  helpful  to  understand  the  structure  of  large 
trading  volumes.  Gopikrishnan,  Plerou,  Gabaix,  and  Stanley  [2000]  find  that  trading  volumes  for 


16 Our analysis does not require exact power laws. It is enough that an important part of the tail distribution is well
approximated by a power law. For instance, lognormal distributions with high variance are often well approximated
by Pareto distributions. The exponent is then interpreted as a local exponent, i.e., \(\zeta(x) = -x p'(x)/p(x) - 1\), rather
than a global exponent.

17 This is one of the aggregation properties of power laws reviewed in Appendix 1.

18 Aggregation issues may also be important to understand the dispersion of exponents [Plerou, Gopikrishnan,
Amaral, and Stanley 1999].

19 Dembo, Deuschel and Duffie [2004], Ibragimov [2005] and Kou and Kou [2004] develop further the importance of
fat tails in finance.
fat  tails  in  finance. 


the  1,000  largest  U.S.  stocks  are  also  power  law  distributed:20 

(3) \(P(q > x) \sim \frac{1}{x^{\zeta_q}}\) with \(\zeta_q \simeq 3/2\).

The precise value estimated is \(\zeta_q = 1.53 \pm 0.07\). Figure V illustrates: the density satisfies
\(p(q) \sim q^{-2.5}\), i.e., (3). The exponent of the distribution of individual trades is close to 1.5. Maslov
and Mills [2001] likewise find \(\zeta_q = 1.4 \pm 0.1\) for the volume of market orders.

Insert  Figure  V  here 

To  test  the  robustness  of  this  result,  we  examine  30  large  stocks  of  the  Paris  Bourse  from 
1995-1999,  which  contain  approximately  35  million  records,  and  250  stocks  of  the  London  Stock 
Exchange in 2001. As shown in Figure V, we find \(\zeta_q = 1.5 \pm 0.1\) for each of the three stock
markets.  The  exponent  appears  essentially  identical  in  the  three  stock  markets,  which  is  suggestive 
of  universality. 

The low exponent \(\zeta_q \simeq 3/2\) indicates that the distribution of volumes is very fat-tailed, and
trading  is  very  concentrated.  Indeed,  the  1  percent  largest  trades  represent  28.5  percent  (±0.6 
percent)  of  the  total  volume  traded.21 
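
To see how strongly an exponent of 3/2 concentrates volume, one can use a standard property of an exact Pareto law with exponent \(\zeta > 1\): the largest fraction p of draws accounts for a share \(p^{1 - 1/\zeta}\) of the total. This formula is a textbook result and is not taken from the text; the function name `pareto_top_share` is hypothetical.

```python
def pareto_top_share(top_fraction, exponent):
    """Share of the total held by the largest `top_fraction` of draws
    under an exact Pareto law with tail exponent `exponent` > 1."""
    if exponent <= 1:
        raise ValueError("the mean is infinite for exponent <= 1")
    return top_fraction ** (1.0 - 1.0 / exponent)

zeta_q = 1.5
share_top_1pct = pareto_top_share(0.01, zeta_q)
share_top_01pct = pareto_top_share(0.001, zeta_q)
print(f"top 1 percent of trades:   {share_top_1pct:.1%} of volume")
print(f"top 0.1 percent of trades: {share_top_01pct:.1%} of volume")
```

The exact-Pareto benchmark is of the same order as the measured concentration. Exact agreement is not expected, since only the tail of the empirical distribution is approximately Pareto.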

The  power  law  of  individual  trades  continues  to  hold  for  volumes  that  are  aggregated  (for  a 
given stock) at the horizon \(\Delta t = 15\) minutes [Gopikrishnan, Plerou, Gabaix, and Stanley 2000]:

(4) \(P(Q > x) \sim \frac{1}{x^{\zeta_Q}}\) with \(\zeta_Q \simeq 3/2\).

We refer to Equations 3-4 as the "half-cubic law of trading volume".

It  is  intriguing  that  the  exponent  of  returns  should  be  3  and  the  exponent  of  volumes  should 
be  1.5.  To  see  if  there  is  an  economic  connection  between  those  values,  we  turn  to  the  relation 
between  return  and  volume. 

II.C. The Power Law of Price Impact: \(r \sim V^{\gamma}\)

The  microstructure  literature  generally  confirms  that  substantial  trades  can  have  a  large  impact. 
Chan and Lakonishok [1993, 1995] estimate a price impact in the range of 0.3 to 1 percent; Keim and Madhavan [1996]


20 We define volume as the number of shares traded. The dollar value traded yields very similar results, since, for
a given security, it is essentially proportional to the number of shares traded.

21 The 0.1 percent largest trades represent 9.6 ± 0.3 percent of the total volume traded. We computed the statistics
on the 100 largest stocks of the Trades and Quotes database in the period 1994-5.


find  4  percent  for  smaller  stocks.  There  are  also  many  anecdotal  examples  of  large  investors  affecting 
prices:  see  Brady  [1988],  Corsetti,  Pesenti,  and  Roubini  [2002],  Coyne  and  Witter  [2002].22 

A  simple  calculation  illustrates  why  one  can  expect  that  a  large  fund  can  move  the  market 
significantly. The typical yearly turnover of a stock is 50 percent of the shares outstanding [Lo and
Wang 2001]; hence daily turnover is 50/250 = 0.2 percent of shares outstanding, based on 250 trading days per year.
Consider a moderately large fund, e.g., the 30th largest fund. At the end of 2000, such a fund held
0.1 percent of the market and hence, on average, 0.1 percent of the capitalization of a given stock.23
To sell its entire holding, the fund would have to trade 0.1/0.2, or half, of the daily turnover. This
supports  the  idea  that  large  funds  are  indeed  large  compared  to  the  liquidity  of  the  market,  and 
that  price  impact  will  therefore  be  an  important  consideration. 
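
The back-of-the-envelope calculation above can be reproduced in a few lines. The inputs are the figures quoted in the text and in footnote 23; the variable names are chosen for this illustration.

```python
yearly_turnover = 0.50            # fraction of shares outstanding traded per year
trading_days = 250
daily_turnover = yearly_turnover / trading_days

fund_assets = 19e9                # dollars under management (footnote 23)
market_cap = 18e12                # dollars, NYSE + Nasdaq + Amex (footnote 23)
fund_share = fund_assets / market_cap

# Size of the fund's position in a stock relative to one day's trading.
holding_vs_daily_turnover = fund_share / daily_turnover

print(f"daily turnover: {daily_turnover:.2%} of shares outstanding")
print(f"fund share of the market: {fund_share:.2%}")
print(f"holding / daily turnover: {holding_vs_daily_turnover:.2f}")
```

With the unrounded footnote figures the ratio comes out slightly above one half, consistent with the rounded 0.1/0.2 in the text.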

We  next  present  evidence  that  the  price  impact  r  of  a  trade  of  size  V  scales  as: 

(5) \(r \sim k V^{\gamma}\),

with \(k > 0\), \(0 < \gamma < 1\), which yields a concave price impact function [Hasbrouck 1991; Hasbrouck
and Seppi 2001; Plerou, Gopikrishnan, Gabaix, and Stanley 2002]. The parameterization \(\gamma = 1/2\)
is often used, e.g., by Barra [1997], Gabaix, Gopikrishnan, Plerou, and Stanley [2003], Grinold and
Kahn [1999], Hasbrouck and Seppi [2001].

Equation 5 implies \(\zeta_r = \zeta_V/\gamma\) by rule (42) in Appendix 1. Hence, given \(\zeta_r = 3\) and \(\zeta_V = 3/2\),
the value \(\gamma = 1/2\) is a particularly plausible null hypothesis. From this relationship, we see a
natural  connection  between  the  power  laws  of  returns  and  volumes. 
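
The link between the exponents can be checked in one step. The sketch below assumes an exact power law tail for V with exponent \(\zeta_V\) and the impact function of Equation 5; it is a heuristic version of the argument, not a reproduction of the appendix rule.

```latex
\begin{align*}
P(r > x) &= P\left(k V^{\gamma} > x\right)
          = P\left(V > (x/k)^{1/\gamma}\right) \\
         &\sim \left[(x/k)^{1/\gamma}\right]^{-\zeta_V}
          \sim x^{-\zeta_V/\gamma},
\end{align*}
```

so that \(\zeta_r = \zeta_V/\gamma\). With \(\zeta_V = 3/2\) and \(\gamma = 1/2\) this gives \(\zeta_r = 3\).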

The exact value of \(\gamma\) is a topic of active research. We report here evidence on the null hypothesis
\(\gamma = 1/2\). We start from the benchmark where, in a given time interval, n blocks are traded, with
volumes \(V_1, \ldots, V_n\), of independent signs \(\varepsilon_i = \pm 1\) with equal probability. Aggregate volume is
\(Q = \sum_{i=1}^{n} V_i\) and aggregate return is:

(6) \(r = u + k \sum_{i=1}^{n} \varepsilon_i V_i^{1/2}\),


22 See also Chiyachantana, Jain, Jiang, and Wood [2004] for international evidence, and Jones and Lipson [2001]
and Werner [2003] for recent U.S. evidence.

23 It had $19 billion in assets under management. The total market capitalization of the New York Stock Exchange,
the Nasdaq and the American Stock Exchange was $18 trillion.


where  u  is  some  other  orthogonal  source  of  price  movement.24  Then, 


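\(E[r^2 \mid Q] = \sigma_u^2 + k^2 E\left[\sum_{i} V_i + \sum_{i \neq j} \varepsilon_i \varepsilon_j V_i^{1/2} V_j^{1/2} \,\Big|\, Q\right] = \sigma_u^2 + k^2 Q + 0,\)

(7) \(E[r^2 \mid Q] = \sigma_u^2 + k^2 Q.\)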

Insert  Figure  VI  here 

The results in Figure VI reveal the affine relation predicted by Equation 7 for large volumes Q,
rather  than  any  clear  sign  of  concavity  or  convexity.  A  formal  test  that  we  detail  in  Appendix  3 
confirms  this  relation. 
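
Equation 7 can also be checked by simulation under the benchmark of Equation 6. The sketch below is illustrative only: the values of k and \(\sigma_u\), the block volumes, and the Gaussian form of u are assumptions, and conditioning on Q is done by holding the block volumes fixed.

```python
import numpy as np

rng = np.random.default_rng(1)
k, sigma_u = 0.3, 0.5                              # hypothetical parameters
block_volumes = np.array([4.0, 1.0, 9.0, 2.0])     # hypothetical V_1, ..., V_n
Q = block_volumes.sum()

n_draws = 400_000
signs = rng.choice([-1.0, 1.0], size=(n_draws, block_volumes.size))
u = sigma_u * rng.standard_normal(n_draws)
# Equation 6 with gamma = 1/2: r = u + k * sum_i eps_i * sqrt(V_i)
r = u + k * (signs * np.sqrt(block_volumes)).sum(axis=1)

simulated = np.mean(r ** 2)
predicted = sigma_u ** 2 + k ** 2 * Q              # Equation 7
print(f"simulated E[r^2 | Q]: {simulated:.3f}")
print(f"predicted E[r^2 | Q]: {predicted:.3f}")
```

The cross terms vanish on average because the signs are independent with mean zero, which is why only the sum of the volumes, Q, enters the prediction.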

Measuring price impact and its dependence on order size is a complex problem for the
following reasons. First, order flow and returns are jointly endogenous. To our knowledge, virtually
all empirical studies, including ours, suffer from this lack of exogeneity in order flow.25

Second,  the  unsplit  size  of  orders  is  unobservable  in  most  liquid  markets.  One  observes  the 
size  of  individual  trades  q,  not  the  size  of  the  desired  block  V.  If  one  does  not  pay  attention 
to  aggregation,  different  exponents  of  price  impact  are  measured,  depending  on  the  time  horizon 
chosen [Plerou, Gopikrishnan, Gabaix, and Stanley 2002, 2004; Farmer and Lillo 2004].26

Third,  order  flow  is  autocorrelated  [Froot,  O'Connell  and  Seasholes,  2001;  Bouchaud,  Gefen, 
Potters,  and  Wyart  2004;  Lillo  and  Farmer  2004].  This  autocorrelation  could  come  from  the  actions 
of  different  traders.  It  is  also  predicted  by  models  of  optimal  execution  of  trades  [Almgren  and 
Chriss 2000; Bertsimas and Lo 1998; Gabaix, Gopikrishnan, Plerou, and Stanley 2003], as large
transactions  are  split  into  smaller  pieces.27 

Although the empirical evidence we gathered is suggestive, measuring the curvature \(\gamma\) of price
impact more accurately will require better data and a technique to address the endogeneity of order
flow. In particular, it would require knowing desired trading volumes, magnitude of price impact
and split of trades for a set of large market participants. In the meantime, we consider evidence such


24 Via Equation 6, a model such as ours provides a foundation for stochastic clock representations of the type
proposed by Clark [1973].

25 An exception is Loeb [1983], who collected bids on different size blocks of stock. Barra [1997] and Grinold and
Kahn [1999, p. 453] report that the best fit of the Loeb data is a square root price impact.

26 In a related way, part of the linearity of Equation 7 can arise because in some simple models total volume and
squared returns depend linearly on the number of trades [Plerou, Gopikrishnan, Amaral, Gabaix, and Stanley 2000].

27 If the trades are executed in the same time window, Equation 7 still holds. If they are not, the estimate of \(\gamma\) is
typically biased downward [Plerou, Gopikrishnan, Gabaix, and Stanley 2004].


as  Figure  VI  as  supportive  of  a  linear  relationship  between  volume  and  squared  return.  However, 
it  is  possible  that  the  true  relationship  is  different,  or  may  vary  from  market  to  market.  This  is 
why we present a theory with a general curvature \(\gamma\).

II.D. The Power Law Distribution of the Size of Large Investors: \(\zeta_S \simeq 1\)

It  is  highly  probable  that  substantial  trades  are  generated  by  very  large  investors.  This  motivates 
us  to  investigate  the  size  distribution  of  market  participants.  A  power  law  formulation: 

(8) \(P(S > x) \sim \frac{1}{x^{\zeta_S}}\)

often  yields  a  good  fit. 

The exponent \(\zeta_S \simeq 1\), often called Zipf's law, is particularly common. This relation is true for
both cities [Zipf 1949; Gabaix and Ioannides 2004] and firms [Axtell 2001; Okuyama, Takayasu,
and Takayasu 1999; Fujiwara, Di Guilmi, Aoyama, Gallegati, and Souma 2004]. If the distribution
of firms in general follows Zipf's law, it is plausible to hypothesize that the distribution of money
management firms in particular follows Zipf's law. Indeed, Pushkin and Aref [2004] find this is the
case for U.S. bank sizes, measured by assets under management.

We  investigate  firms  for  which  money  management  is  the  core  business:  mutual  funds.28  We 
use CRSP to obtain the size (dollar value of assets under management) of all mutual funds29
from 1961-1999. For each year t, we estimate the power law exponent \(\zeta\) of the tail distribution
(20 percent cutoff) via OLS. We find an average coefficient \(\zeta = 1.10\), with a standard deviation
across years of 0.08.30 The Hill estimator technique gives a mean estimate \(\zeta = 0.93\) and a standard
deviation of 0.07. Hence we conclude that, to a good approximation, mutual fund sizes follow a
power law distribution with exponent:

(9) \(\zeta_S \simeq 1\).
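
For readers unfamiliar with the Hill estimator mentioned above, a minimal sketch follows. It is applied to synthetic data rather than CRSP fund sizes; the function name `hill_exponent`, the sample size, and the exact Zipf draw are assumptions made for the illustration, and the 20 percent cutoff simply mirrors the one quoted in the text.

```python
import numpy as np

def hill_exponent(sizes, tail_fraction=0.20):
    """Hill estimate of the tail exponent from the largest observations."""
    x = np.sort(np.asarray(sizes, dtype=float))[::-1]    # largest first
    k = int(len(x) * tail_fraction)                      # number of tail points
    # Mean log-excess of the k largest points over the (k+1)-th largest.
    log_excess = np.log(x[:k]) - np.log(x[k])
    return 1.0 / log_excess.mean()

rng = np.random.default_rng(2)
# Hypothetical fund sizes drawn from an exact Zipf law, P(S > x) = 1/x for x >= 1.
fund_sizes = rng.pareto(1.0, size=50_000) + 1.0
zeta_hat = hill_exponent(fund_sizes)
print(f"Hill estimate of the tail exponent: {zeta_hat:.3f}")
```

On real data the estimate depends on the cutoff, which is one reason to report the OLS and Hill estimates side by side.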

For this paper, we can take this distribution of the sizes of mutual funds as a given. It is, in
fact, not difficult to explain. One can apply the explanations given for cities [Simon 1955; Gabaix


28 Here we sketch the main findings. Gabaix, Ramalho, and Reuter [2005] present much more detail.
29 The x funds of Fidelity, for instance, count as x different funds, not as one big "Fidelity" fund.
30 We cannot conclude that the standard deviation on our mean estimate is \(0.08\,(1999 - 1961 + 1)^{-1/2}\). The estimates
are not independent across years, because of the persistence in mutual fund sizes.

1999; Gabaix and Ioannides 2004] to mutual funds. Suppose that the relative size \(S_{it}\) of a mutual
fund i follows a random growth process \(S_{it} = S_{i,t-1}(1 + \varepsilon_{it})\), with \(\varepsilon_{it}\) i.i.d. and mean 0. Add a
minor element of friction to small funds to ensure a steady state distribution; for instance, very
small funds are terminated and are replaced by new funds. Then, this steady state distribution
follows Zipf's law with \(\zeta_S = 1\).31

Gabaix,  Ramalho  and  Reuter  [2005]  develop  this  idea  and  show  that  these  assumptions  are 
verified empirically. This means that the random growth of mutual funds generically leads their size
distribution to satisfy Zipf's law, \(\zeta_S = 1\).
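
The random growth mechanism can be illustrated with a small simulation. This is a sketch under stated assumptions, not the specification of Gabaix, Ramalho and Reuter: growth shocks are lognormal with mean zero, the friction is modeled as a floor at which shrinking funds are replaced, and the parameter values (`sigma`, `floor`, the number of funds and periods) are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n_funds, n_periods = 20_000, 2_000
sigma = 0.2                      # hypothetical volatility of growth shocks
floor = 1.0                      # hypothetical minimum size (the friction)

log_size = np.zeros(n_funds)     # every fund starts at the floor
for _ in range(n_periods):
    # ln(1 + eps) ~ N(-sigma^2 / 2, sigma^2), so eps has mean exactly 0.
    log_size += rng.normal(-0.5 * sigma**2, sigma, size=n_funds)
    # Funds falling below the floor are replaced by new funds at the floor.
    np.maximum(log_size, np.log(floor), out=log_size)

# Hill estimate of the tail exponent on the largest 5 percent of funds.
top = np.sort(log_size)[::-1]
k = n_funds // 20
zeta_hat = 1.0 / np.mean(top[:k] - top[k])
print(f"estimated tail exponent: {zeta_hat:.2f}")
```

Under these assumptions the estimated exponent should settle near 1, whatever the value of `sigma`, which is the sense in which random growth generically produces Zipf's law.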

Insert  Figure  VII  here 

It is only in the past 30 years that mutual funds have come to represent a large part of the marketplace.
It would be interesting to have evidence on the size distribution of financial institutions
before mutual funds became important. For instance, pension funds of corporations are likely to
follow Zipf's law, as the number of employees in firms follows Zipf's law.

The  evidence  we  present  here  is  necessarily  tentative.  Estimating  a  power  law  with  a  relatively 
small  number  of  points  is  very  difficult,  and  all  estimators  require  somewhat  arbitrary  parameters 
[Embrechts, Klüppelberg, and Mikosch 1997]. Furthermore, we had access to only a subset of the
participants in the U.S. market. Other important participants are hedge funds, pension funds,
proprietary trading desks, and foreign institutions. It would be useful to weight the funds by
their  leverage  and  their  annual  turnover.  Nevertheless,  given  that  Zipf's  law  (Equation  9)  has  been 
found  to  describe  the  size  of  many  other  entities,  such  as  banks  and  firms  in  general,  and  appears 
to  describe  well  the  upper  tail  of  the  empirical  distribution  of  mutual  funds,  we  view  Equation  9 
as  a  good  benchmark. 

II.E. Summary and Paradoxes

The facts summarized in this section present important challenges. First, economic theories have
difficulty explaining the power law distribution of returns: both efficient markets theory and
GARCH models need to be fine-tuned to explain why the distribution of returns would have an
exponent of 3.


It may be useful to give a short proof. Suppose the process is $dS_t = S_t \sigma\, dB_t$. The steady state density $p(S)$
satisfies the forward Kolmogorov equation $0 = \partial_t p = \frac{1}{2} \frac{\partial^2}{\partial S^2} \left( \sigma^2 S^2 p(S) \right)$. This implies $p(S) = k/S^2$ for a constant $k$,
and a cumulative distribution $P(S > x) = k/x$, i.e., Zipf's law.
Second, it is surprising that the Pareto exponent of trading volume is $\zeta_q \simeq 1.5$, while that of
institution size is $\zeta_S \simeq 1$. In models with frictionless trading, all agents have identical portfolios
and trading policies, except that they are scaled by the size $S$ of the agents (which corresponds to
wealth). Hence frictionless trading predicts that the distribution of trading volume of a given stock
should reflect the distribution of the size of its investors, i.e. $\zeta_q = \zeta_S = 1$.32 However, we find that
$\zeta_q > \zeta_S$.33 A likely cause is the cost of trading; large institutions trade more prudently than small
institutions, because price impact is monotonically increasing in trade size.

Finally, the basic price impact model [Kyle 1985] predicts a linear relation between returns and
volume, which would imply $\zeta_r = \zeta_q$. To explain why $\zeta_q/\zeta_r$ is close to 1/2, we require a model with
curvature of price impact $\gamma = 1/2$.

We  now  present  a  model  that  attempts  to  resolve  the  above  paradoxes.34 

III.      The  Model 

We  consider  a  large  fund  in  a  relatively  illiquid  market.  We  first  describe  a  rudimentary  model 
for  the  price  impact  of  its  trades.  Next,  we  link  the  various  power  law  exponents;  this  represents 
the  core  contribution  of  this  paper.  One  could  employ  different  microfoundations  for  price  impact 
without  changing  our  conclusions. 

III. A.     A  Simple  Model  to  Generate  a  Power  Law  Price  Impact 

Before  presenting,  in  Section  III.B,  the  core  of  the  model,  we  first  present  a  simple  microfoundation 
for  the  square  root  price  impact.  The  basic  model  of  Kyle  [1985]  predicts  a  linear  price  impact. 
Subsequent  models,  such  as  Seppi  [1990],  Barclay  and  Warner  [1993],  and  Keim  and  Madhavan 
[1996],  generate  a  concave  impact  in  general.  Zhang  [1999]  and  Gabaix,  Gopikrishnan,  Plerou,  and 
Stanley [2003] produce a square root function in particular.35 The model used in this section is


32 Solomon and Richmond [2001] have proposed a model that relies on a scaling exponent of wealth $\zeta_S = 3/2$. We
are  sympathetic  to  this  approach  that  links  wealth  to  volumes.  In  the  present  study  we  use  the  size  distribution 
of  institutions,  rather  than  individual  wealth,  because  most  very  large  trades  are  likely  to  be  done  by  institutions 
rather  than  by  private  individuals.  Also,  the  Pareto  exponent  of  wealth  and  income  is  quite  variable  [e.g.,  Davies  and 
Shorrocks  2000;  Piketty  and  Saez  2003]. 

The  lower  the  power  law  exponent,  the  fatter  the  tails  of  the  variable.  See  Appendix  1. 

34 Gabaix, Gopikrishnan, Plerou, and Stanley [2003] presents a reduced form of some elements of the present article.

35 Gabaix, Gopikrishnan, Plerou, and Stanley [2003] predicts that a trade of size $V$ will be split into $N = V^{1/2}$
smaller chunks. This has the advantage of generating a power law distribution of the number of trades with exponent
$\zeta_N = 2\zeta_V = 3$, which is close to the empirical value [Plerou, Gopikrishnan, Amaral, Gabaix, and Stanley 2000]. We
did not keep it in the current model, because we wanted to streamline the microfoundation of price impact.
a formalized version of a useful heuristic argument, sometimes called the "Barra model" of Torre
and  Ferrari  [Barra  1997]. 

We consider a single risky security in fixed supply, with a price $p(t)$ at time $t$. The large fund
("he") buys the security from, or sells it to, a liquidity supplier ("she"). The timing of the model is as
follows:

At  time  t  =  0,  the  fund  receives  a  signal  M  about  mispricing.  M  <  0  is  a  sell  signal,  M  >  0 
is  a  buy  signal.  Without  loss  of  generality,  we  study  a  buy  signal.  The  analysis  is  symmetric  for  a 
sell  signal. 

At $t = 1 - 2\varepsilon$ ($\varepsilon$ is a small positive number), the fund negotiates a price with the liquidity
supplier. For simplicity, we assume liquidity provision is competitive so that the fund has full
bargaining power. The liquidity supplier sells to the fund the quantity $V$ of shares, at a price $p + R$,
where $p = p(1 - 2\varepsilon)$ is the price before impact and $R$ is the price concession, or full price impact.36

At $t = 1 - \varepsilon$, the transaction is announced to the public.

At $t = 1$, the price jumps to $p(1) = p + \pi(V)$, where $\pi(V)$ is the permanent price impact. The
difference between $R$ and $\pi$ is the temporary price impact $\tau = R - \pi$. Equilibrium will determine
the value of the permanent price impact $\pi(V)$ and the price concession $R(V)$.

From $t = 1$ onwards, the price follows a random walk with volatility $\sigma$:

(10)  $p(t) = p + \pi(V) + \sigma B(t)$,

where $B$ is a standard Brownian motion with $B(1) = 0$. Also, at $t = 1$, the liquidity supplier starts
replenishing her inventory. She continuously meets sellers who are willing to sell her a quantity $\bar{V} dt$
of the stock at price $p(t)$: she is a price taker as she can credibly assure that she is not informed. The
liquidity supplier continues to buy shares until her inventory is fully replenished, which happens
after a time $T = V/\bar{V}$. The price continues to evolve according to (10).

The liquidity provider benefits from the temporary price impact $\tau$, but then faces price
uncertainty as she replenishes her inventory [Grossman and Miller 1988]. To evaluate these effects, we


To  keep  the  mathematics  simple,  the  impact  is  additive,  and  the  price  otherwise  follows  a  random  walk.  It  is  easy, 
though  cumbersome,  to  make  the  price  impact  proportional  and  the  log  price  follow  a  random  walk.  Our  conclusions 
about  the  power  law  exponents  would  not  change. 

37 We wish to add two comments about the timing of the model. In our model, the large fund trades in one block
(at time $t = 1$), and the liquidity provider trades in many smaller chunks (at time $t \in (1, 1 + T]$). Alternative timing
assumptions would leave the scaling relation (12) unchanged, with the same $\gamma$. Also, the exchange between the large
fund and the liquidity supplier is an "upstairs" block trade. In an upstairs trade, the initiator typically commits not
to repeat the trade too soon in the future. This prevents many market manipulation strategies that might otherwise
be possible with a non-linear price impact, such as those analyzed by Huberman and Stanzl [2004].
assume that the liquidity provider has the following mean variance utility function on the total
amount $W$ of money earned during the trade

(11)  $U = E[W] - \lambda \left[ \mathrm{var}(W) \right]^{\delta/2}$,

with $\lambda > 0$ and $\delta > 0$. The liquidity supplier requires compensation equal to $\lambda\sigma^{\delta}$ to bear a
risk of standard deviation $\sigma$, i.e. has "$\delta$-th order risk aversion."38 With standard mean-variance
preferences, $\delta = 2$. In many cases, a better description of behavior is first-order risk aversion,
which corresponds to $\delta = 1$.39

One justification for first-order risk aversion comes from psychology. Prospect theory [Kahneman
and Tversky 1979] presents psychological evidence for this behavior, which has also been
formalized in disappointment aversion [Gul 1991; Backus, Routledge, and Zin 2005]. Second,
first-order risk aversion is frequently needed to calibrate quantitative models, such as Epstein and Zin
[1990] and Barberis, Huang, and Santos [2001]. A third justification is institutional, as (11) can
reflect a value at risk penalty, where $\lambda$ is the size of the penalty, and $\mathrm{var}(W)^{1/2}$ is proportional
to the value at risk. Another institutional justification is via the Sharpe ratio. If a trader uses a
rule to accept trades if and only if their Sharpe ratio is greater than $\lambda$, then he will behave as if he
exhibits first-order risk aversion.

Proposition 1 The setup of this section generates the temporary price impact function:

(12)  $\tau(V) = H V^{\gamma}$

with $H = \lambda\sigma^{\delta} / (3\bar{V})^{\delta/2}$ and

$\gamma = \frac{3\delta}{2} - 1$.

For  future  reference,  it  is  useful  to  state  separately  our  central  case. 

Proposition 2 If the liquidity provider is first order risk averse, then the price impact increases
with the square root of traded volume:

$\gamma = 1/2$


Essentially all non-expected utility theories need a postulate on how different gambles are integrated. Here, we
assume that the liquidity provider evaluates individually the amount $W$ earned in the trade.

39 The model also generates a square root price impact with a different specification that generates first order risk
aversion, for instance the loss averse utility function: $U = E[\max(W, 0)] + \lambda E[\min(W, 0)]$ with $\lambda > 1$.
and

(13)  $\tau(V) = \lambda\sigma \left( \frac{V}{3\bar{V}} \right)^{1/2}$.

$\sigma$ is the daily volatility of the stock, and $\lambda$ the risk aversion of the liquidity provider.

In practice, $\bar{V}$ is likely to be proportional to the daily trading volume. Hence the scaling
predictions of Equation 13 can be almost directly examined.

The proof is in Appendix 2. The intuition is that the liquidity provider needs a time $T = V/\bar{V}$
to buy back the $V$ shares. During that time, the price diffuses at a rate $\sigma$. Hence the liquidity
provider faces a price uncertainty with standard deviation $\sigma\sqrt{T} \sim \sigma\sqrt{V}$. If the liquidity provider
is first order risk averse, the price concession $\tau$ is proportional to the standard deviation, hence
$\tau \sim \sigma\sqrt{V}$, i.e., Equation 13.
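
The intuition can be checked with a minimal Python sketch. It is not the proof of Appendix 2. It assumes that the liquidity provider's only risk comes from the random walk term while she buys back at the constant rate $\bar{V}$, so that the variance of her earnings is $\sigma^2 V^3 / (3\bar{V})$, and that competition drives her utility (11) to zero. The function and argument names are illustrative and are not taken from the paper.

```python
import math


def temporary_impact(V, sigma, lam, V_bar, delta=1.0):
    """Temporary price impact tau(V) implied by a zero-utility condition.

    Assumption (ours): the supplier sells V shares at once, then buys back
    at rate V_bar over T = V / V_bar while the price follows a random walk
    with volatility sigma. Her earnings W then have
        var(W) = sigma**2 * V_bar**2 * T**3 / 3 = sigma**2 * V**3 / (3 * V_bar)
    and E[W] = V * tau. Setting E[W] - lam * var(W)**(delta / 2) = 0 gives tau.
    """
    var_W = sigma ** 2 * V ** 3 / (3.0 * V_bar)
    return lam * var_W ** (delta / 2.0) / V


if __name__ == "__main__":
    for V in (1e4, 4e4, 1.6e5):
        tau = temporary_impact(V, sigma=0.02, lam=1.0, V_bar=1e6)
        print(f"V = {V:>9.0f}  tau = {tau:.6f}")
```

With `delta=1` the function reduces to $\lambda\sigma (V / 3\bar{V})^{1/2}$, so quadrupling the trade size doubles the impact; with `delta=2` the impact grows as $V^2$.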

To close the model, we need to determine both the permanent and the full price impact. The
determination of these two variables typically depends on the fine details of the information structure
processed  by  the  other  market  participants.  We  use  a  somewhat  indirect  route,  which  drastically 
simplifies  the  analysis. 

Assumption 1 We assume that the market uses a linear rule to determine the full price impact,

(14)  $R(V) = B\tau(V)$

for some $B > 0$. Section IV.A presents conditions under which the linear rule (14) is actually
optimal.

Assumption 1 closes the price impact part of the model.

Proposition 3 The above setup generates the price concession function

(15)  $R(V) = hV^{\gamma}$

where $h = BH$, and $H$ and $\gamma$ are determined in Propositions 1 and 2.
III.B.      The Core Model: Behavior of a Large Fund

We now lay out the core of our model. The fund periodically receives signals about trading
opportunities, which indicate that the excess risk-adjusted return on the asset is $s_t M_t C$. $s_t$, $M_t$ and $C$
are independent. $s_t = \pm 1$ is the sign of the mispricing. $M_t$ is the expected absolute value of the
mispricing. $M_t$ is drawn from a distribution $f(M)$, which we assume to be not too fat-tailed.

Assumption 2 We assume that $M$ is not too fat-tailed: $E[M^{1+1/\gamma}] < \infty$.

The model misspecification risk $C$ captures uncertainty over whether the perceived mispricing
is in fact real. For example, the fund's predictive regressions may result from data mining, or the
mispricing may have since been arbitraged away. $C$ can take two values, 0 and $C^*$. If $C = 0$, the
signals the fund perceives are pure noise, and the true average return on the perceived mispricings
is 0. If $C = C^*$, the mispricings are real. We specify $E[C] = 1$, so that $M$ represents the expected
value of the mispricing.

The fund has $S$ dollars in assets. If it buys a volume $V_t$ of the asset, and pays a price concession
$R(V_t)$, the total return of its portfolio is:

(16)  $r_t = V_t \left( C M_t - R(V_t) + u_t \right) / S$

where $u_t$ is mean zero noise.

If the model is wrong, expected returns are:

(17)  $E[r_t \mid C = 0] = -V_t R(V_t)/S$.

We assume that the manager has a concern for robustness. He does not want his expected return
to be below some value $-\Lambda$ percent if his trading model is wrong. Formally, this means:

(18)  $E[r_t \mid C = 0] \ge -\Lambda$.

Equation 18 can be justified in several ways. One is a psychological attitude towards model
uncertainty, developed in depth by Gilboa and Schmeidler [1989] and Hansen and Sargent [2005].
Second, Equation 18 is a useful rule of thumb that can be applied without requiring detailed
information about the fine details of model uncertainty. A third explanation is delegated management
[Shleifer and Vishny 1997]. If trader ability is uncertain, investors may wish to impose a constraint
such as Equation 18 to prevent excessive trading.

To simplify the algebra, we assume that, subject to the robustness constraint, the manager
wants to maximize the expected value of his excess returns $E[r_t]$.40 We now summarize this.

Definition  1  Suppose  that  the  fund  has  S  dollars  under  management.  The  fund's  optimal  policy 
is  a  function  V(M,S)  that  specifies  the  quantity  of  shares  V  traded  when  the  fund  perceives  a 
mispricing  of  size  M.  It  maximizes  the  expected  returns  E  [rt]  subject  to  the  robustness  constraint 
(18): 


(19)  $\max_{V(M,S)} E[r_t]$ subject to $E[r_t \mid C = 0] \ge -\Lambda$


III.C.      Optimal  Strategy  and  Resulting  Power  Law  Exponents 

We  can  now  derive  the  large  fund's  strategy.  Given  Equation  16,  Definition  1  is  equivalent  to: 

$\max_{V(M,S)} \frac{1}{S} \int_0^\infty V(M,S) \left( M - R(V(M,S)) \right) f(M)\, dM$

s.t. $-\frac{1}{S} \int_0^\infty V(M,S)\, R(V(M,S))\, f(M)\, dM \ge -\Lambda$.

Appendix  2  establishes  the  following  Proposition. 

Proposition 4 If constraint (18) binds, the optimal policy is to trade a volume:

(20)  $V(M,S) = v M^{1/\gamma} S^{1/(1+\gamma)}$.

The price change after the trade is:

(21)  $R(M,S) = h v^{\gamma} M S^{\gamma/(1+\gamma)}$

for a positive constant $v$, defined in Equation 48, which is increasing in $\Lambda$ and decreasing in $h$.

Equation  21  means  that  price  movements  reflect  both  the  intensity  of  the  perceived  mispricing 
M ,  and  the  size  of  the  fund  S.    Concretely,  a  large  price  movement  can  come  from  an  extreme 


40 One might prefer the formulation $\max_{V(M,S)} E[u(r)]$ subject to $E[u(r) \mid C = 0] \ge u(-\Lambda)$, with a concave utility
$u$. Fortunately, this does not change the conclusions in many instances, such as $u(r) = -e^{-ar}$, $a > 0$. On the other
hand, with a non-linear function $u$ the derivations are more complex, as they rely on asymptotic equalities, rather
than exact equalities. To keep things simple, we use the linear representation (19).
signal or the trade of a large fund [Easley and O'Hara 1987].
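
To make the optimal policy concrete, the following Python sketch evaluates $V(M,S) = v M^{1/\gamma} S^{1/(1+\gamma)}$ for a discrete distribution of signals and checks that the robustness constraint binds. Equation 48 is not reproduced in this section, so the constant used here, $v = \left( \Lambda / (h\, E[M^{1+1/\gamma}]) \right)^{1/(1+\gamma)}$, is our assumption: it is the value obtained by requiring the expected price impact cost to equal $\Lambda S$ exactly. All names and numerical values are illustrative.

```python
def optimal_volume(M, S, Lam, h, gamma, m_values, m_probs):
    """Volume traded on a signal M by a fund of size S when (18) binds.

    v is chosen (our assumption) so that the expected impact cost,
    h * E[V**(1 + gamma)], is exactly Lam * S.
    """
    moment = sum(p * m ** (1.0 + 1.0 / gamma) for m, p in zip(m_values, m_probs))
    v = (Lam / (h * moment)) ** (1.0 / (1.0 + gamma))
    return v * M ** (1.0 / gamma) * S ** (1.0 / (1.0 + gamma))


def price_concession(V, h, gamma):
    """Power law price concession R(V) = h * V**gamma, as in Equation 15."""
    return h * V ** gamma


def expected_impact_cost(S, Lam, h, gamma, m_values, m_probs):
    """Expected price impact cost as a fraction of assets: E[V * R(V)] / S."""
    total = 0.0
    for m, p in zip(m_values, m_probs):
        V = optimal_volume(m, S, Lam, h, gamma, m_values, m_probs)
        total += p * V * price_concession(V, h, gamma)
    return total / S


if __name__ == "__main__":
    signals, probs = [0.01, 0.02, 0.05], [0.5, 0.3, 0.2]
    for S in (1e8, 1e9, 1e10):
        cost = expected_impact_cost(S, 0.001, 1e-6, 0.5, signals, probs)
        print(f"S = {S:.0e}  expected impact cost / S = {cost:.6f}")
```

Under these assumptions the cost share equals $\Lambda$ for every fund size, while the price concession is linear in the signal $M$ and grows as $S^{\gamma/(1+\gamma)}$.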

In  the  remaining  analysis,  we  assume  that  (18)  holds  over  the  support  of  S. 

Assumption  3  The  robustness  constraint  (18)  binds  for  all  funds  in  the  market  above  a  certain 
size. 

A simple calibration presented in Appendix 2 shows that Assumption 3 holds for funds that
manage less than S* = $21 trillion. Assumption 3 is therefore not very stringent. Alternatively,
Section IV.C shows a way to ensure Assumption 3 without any finite size effects.41

We  next  derive  the  distribution  of  volume  and  price  changes. 

Proposition  5  The  traded  volume  and  the  price  changes  follow  power  law  distributions  with 
respective  exponents: 


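(22)  $\zeta_V = \min\left[ (1+\gamma)\zeta_S,\ \gamma\zeta_M \right]$

(23)  $\zeta_R = \min\left[ \left(1 + \frac{1}{\gamma}\right)\zeta_S,\ \zeta_M \right]$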


Equation 21 implies that the distribution of price movements reflects both the "news" (perhaps
coming from proprietary analysis), as reflected in $M$, as well as the size $S$ of the agents that act
on the news. Equation 23 illustrates the resulting exponent. In equilibrium, it is the fatter of the
two tails of signals and sizes that matters. Mathematically, this comes from the properties (38)
of power laws: the tail exponent of the product of two independent random variables $X_1$ and $X_2$
is equal to the tail exponent of the more fat-tailed variable, i.e., is the lower of the exponents of
$X_1$ and $X_2$. Economically, this means that the polar case, where large investors affect the tail of
trading volume, is captured when $\zeta_M > \left(1 + \frac{1}{\gamma}\right)\zeta_S$. Then, we get:

(24)  $\zeta_V = (1+\gamma)\zeta_S$

(25)  $\zeta_R = \left(1 + \frac{1}{\gamma}\right)\zeta_S$


41 Such cutoffs are generally present when handling power laws, and are sometimes called "border" or "finite size"
effects. The cutoff affects predictions only very little. For instance, it affects the power law exponent of returns only
by a factor $10^{-3}$ if a large fund has a size $S = 10^{-3} S^*$, which is a plausible empirical order of magnitude.
Equation 23 then means that, when there is a very large movement, it is more likely to come from
the  actions  of  a  very  large  institution  (the  S  term),  rather  than  an  objectively  important  piece 
of  news  (the  M  term).  This  potential  importance  of  a  large  institution  may  explain  why,  during 
the  Long  Term  Capital  Management  crisis,  the  October  1987  crash,  and  the  events  studied  by 
Cutler,  Poterba,  and  Summers  [1989],  prices  moved  in  the  absence  of  significant  news  items.  In 
the  context  of  our  theory,  the  extreme  returns  occurred  because  some  large  institutions  wished  to 
make  substantial  trades  in  a  short  time  period. 
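
The dominance of the fund-size term in the tail can be illustrated by simulation. The Python sketch below is only an illustration under assumptions of our own: fund sizes follow an exact Pareto law with exponent $\zeta_S = 1$, signals $M$ are uniform on $[0.5, 1.5]$ (hence thin-tailed), constants are normalized to one, and the tail exponent is measured with the standard Hill estimator, which is not a method used in the paper. Price changes are generated as $M S^{\gamma/(1+\gamma)}$, following Equation 21.

```python
import math
import random


def hill_estimator(samples, k):
    """Hill estimate of the tail exponent from the k largest observations."""
    xs = sorted(samples, reverse=True)
    threshold = xs[k]  # the (k+1)-th largest value serves as the cutoff
    return k / sum(math.log(x / threshold) for x in xs[:k])


def simulate_price_changes(n, gamma=0.5, zeta_S=1.0, seed=0):
    """Draw n price changes proportional to M * S**(gamma / (1 + gamma)).

    Assumptions (ours): S is Pareto with exponent zeta_S and minimum 1,
    M is uniform on [0.5, 1.5], and S and M are independent.
    """
    rng = random.Random(seed)
    changes = []
    for _ in range(n):
        S = rng.paretovariate(zeta_S)
        M = rng.uniform(0.5, 1.5)
        changes.append(M * S ** (gamma / (1.0 + gamma)))
    return changes


if __name__ == "__main__":
    sample = simulate_price_changes(200_000)
    print("estimated tail exponent of price changes:", hill_estimator(sample, 2_000))
```

With $\gamma = 1/2$ and $\zeta_S = 1$ the estimate should lie close to $(1 + 1/\gamma)\zeta_S = 3$, even though the signals themselves are bounded: the largest simulated price changes come from the largest funds.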

Proposition  5  says  that  when  the  distribution  of  the  size  of  institutions  is  more  fat-tailed,  volume 
and returns are also more fat-tailed. However, when the curvature $\gamma$ of price impact is smaller,
returns  are  less  fat-tailed,  but  volumes  are  more  fat-tailed.  The  reason  is  that  large  institutions 
trade  more  moderately  when  the  price  impact  is  steeper.  We  now  apply  Proposition  5  to  our 
baseline  values. 

Proposition 6 With a square root price impact ($\gamma = 1/2$) and Zipf's law for financial institutions
($\zeta_S = 1$), volumes and returns follow power law distributions, with respective exponents of 3/2 and
3:

(26)  $\zeta_V = 3/2$

(27)  $\zeta_R = 3$.

These  exponents  are  the  empirical  values  of  the  distribution  of  volume  and  returns. 

Proposition 6 captures our explanation of the origins of the cubic law of returns, and the
half-cubic law of volumes. Random growth of mutual funds leads to Zipf's law of financial institutions,
$\zeta_S = 1$. The model of Section III.A leads to a power law price impact with curvature $\gamma = 1/2$. As
large funds wish to lessen their price impacts, their trading volumes are less than proportional to
their size. This generates a power law distribution of the size of trades that is less fat-tailed than
the size distribution of mutual funds. The resulting exponent is $\zeta_V = 3/2$, which is the empirical
value. Trades of large funds create large returns, and indeed generate the power law distribution of
returns with exponent $\zeta_R = 3$.
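
The mapping from primitives to exponents in Propositions 5 and 6 is simple enough to write down directly. The Python sketch below computes $\zeta_V = \min[(1+\gamma)\zeta_S, \gamma\zeta_M]$ and $\zeta_R = \min[(1+1/\gamma)\zeta_S, \zeta_M]$; the function name is ours, and an infinite $\zeta_M$ is used as a stand-in for thin-tailed signals.

```python
def tail_exponents(gamma, zeta_S, zeta_M=float("inf")):
    """Power law exponents of volume and price changes.

    gamma  : curvature of the price impact function
    zeta_S : exponent of the size distribution of funds
    zeta_M : exponent of the signal distribution (inf means thin-tailed)
    """
    zeta_V = min((1.0 + gamma) * zeta_S, gamma * zeta_M)
    zeta_R = min((1.0 + 1.0 / gamma) * zeta_S, zeta_M)
    return zeta_V, zeta_R


if __name__ == "__main__":
    print("baseline (gamma = 1/2, zeta_S = 1):", tail_exponents(0.5, 1.0))
    print("linear impact (gamma = 1):         ", tail_exponents(1.0, 1.0))
    print("fat-tailed signals (zeta_M = 2):   ", tail_exponents(0.5, 1.0, 2.0))
```

The baseline call reproduces the exponents 3/2 and 3. The second call shows that a linear price impact would give equal exponents for volume and returns, and the third shows signals taking over the tail once $\zeta_M$ falls below $(1 + 1/\gamma)\zeta_S$.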
IV.      Robustness and Extensions

IV.A.      Permanent  versus  Transitory  Price  Impact 

So far we have analyzed the full price impact $R$, which is the sum of a permanent component $\pi$ and
transitory component $\tau$: $R = \pi + \tau$. We provide a sufficient condition that will ensure that the
permanent and the full price impact are proportional. In a Bayesian framework, the price impact
must come from an inference, which from Proposition 4 is:

(28)  $\pi(V) = E\left[ M \mid hV^{\gamma} = h v^{\gamma} M S^{\gamma/(1+\gamma)} \right]$


The conditional expectation (28) is complicated and can be non-linear. It is difficult to see how
agents would apply Bayes' rule to compute (28), which requires knowing the distribution of $M$, and
$M$ is not a directly observable quantity. However, these difficulties vanish in a class of cases: when
agents use (28) with the belief that $\zeta_M \le \zeta_R$. The case where they believe $\zeta_M = \zeta_R$ is particularly
plausible. If one does not know the distribution of mispricings perceived by other agents, one might
hypothesize that it is close to the distribution of returns. This motivates the following Proposition.

Proposition 7 Suppose that updaters performing (28) believe $\zeta_M \le \zeta_R$. Then, the exponent $\zeta_\pi$
of the permanent price impact is equal to the exponent $\zeta_R$ of the full price impact, and is given by
Proposition 5,

(29)  $\zeta_\pi = \zeta_R = \min\left[ \left(1 + \frac{1}{\gamma}\right)\zeta_S,\ \zeta_M \right]$.

If the updaters believe $\zeta_M < \zeta_R$, there is a constant $b > 0$ s.t., for large volumes, the permanent
price impact is:

(30)  $\pi(V) = E[M \mid V] \sim bV^{\gamma}$.

If updaters performing (28) believe $\zeta_M = \zeta_R$, then $\pi(V) = V^{\gamma} L(V)$, where $L$ is a "slowly varying"
function that varies more slowly than any polynomial (see Appendix 1).

Proposition 7 presents sufficient conditions for $\pi(V)$ to preserve the power law price impact
under  Bayesian  updating,  and  thus  to  justify  Assumption  1. 
IV.B.     Multiple Stocks

The model can easily be extended to multiple stocks. Suppose stock $i$ has a power law impact
$R_i(V) = h_i V^{\gamma}$, that the signals $M_i$ are independent across stocks, and the model misspecification
risk $C$ is common across stocks.42 The trader's program is to maximize the expected profit from
trading over all stocks,

$\max_{V_i(M_i,S),\ i=1...n} \frac{1}{S} \sum_i \int_0^\infty V_i(M_i,S) \left( M_i - R_i(V_i(M_i,S)) \right) f_i(M_i)\, dM_i$

subject to the robustness constraint that he does not lose more than $\Lambda$ percent in price impact
costs:

$\frac{1}{S} \sum_i \int_0^\infty V_i(M_i,S)\, R_i(V_i(M_i,S))\, f_i(M_i)\, dM_i \le \Lambda$.

Following the proof of the main Proposition, one can show that the solution is $h_i V_i(M_i,S)^{\gamma} =
K M_i S^{\gamma/(1+\gamma)}$, where $K$ does not depend on $i$ and $S$. Hence, the power law exponents derived in
Propositions 5-6 follow.

IV. C.      Different  Quality  of  Signals  Across  Firms 

We now allow the quality of signal $M$ to differ across funds, and show that this does not affect
our results. We assume that fund $f$ receives signals distributed according to $M = \chi_f m$, where $\chi_f$
is the quality of the fund's signals, and the distribution of $m$ is the same across funds. Following
the proof of Proposition 4, the optimal trading quantity of a fund of size $S$ is still, for a constant
$K = (\Lambda/h)^{1/(1+\gamma)}$:

$V(m,S) = K \frac{M^{1/\gamma} S^{1/(1+\gamma)}}{E\left[ M^{1+1/\gamma} \right]^{1/(1+\gamma)}} = K \frac{m^{1/\gamma} S^{1/(1+\gamma)}}{E\left[ m^{1+1/\gamma} \right]^{1/(1+\gamma)}}$

as $M = \chi_f m$. The average quality $\chi_f$ of the signals disappears. Hence, one still obtains $\zeta_V =
\min[(1+\gamma)\zeta_S, \gamma\zeta_M]$ and $\zeta_R = \zeta_V/\gamma$.

In general, one expects larger firms to have a higher $\chi$. For instance, if signals are generated
according to a production function $\chi(F) = F^{\kappa}$ where $F$ denotes investment in research, then the
optimal investment for a fund satisfies $\max_F C F^{\kappa} S^{\gamma/(1+\gamma)} - F$, for a constant $C$. Hence $F \sim
S^{\gamma/[(1-\kappa)(1+\gamma)]}$ and the quality of signals is $\chi \sim S^{\theta}$ for $\theta = \gamma\kappa / [(1-\kappa)(1+\gamma)]$.


42 It is easy to verify that $C$ could be also specific to each stock, or to each one of different classes of stocks.
This framework allows us to provide a microfoundation for Assumption 3 without any upper
cutoff. The proof of Proposition 4 shows that Assumption 3 holds if:

$S < \frac{E\left[ M^{1+1/\gamma} \right] (1+\gamma)^{-(1+1/\gamma)}}{h^{1/\gamma}\Lambda} = \frac{E\left[ m^{1+1/\gamma} \right] (1+\gamma)^{-(1+1/\gamma)}}{h^{1/\gamma}\Lambda}\, \chi_f^{1+1/\gamma} \sim S^{\kappa/(1-\kappa)}$

which holds if $\kappa > 1/2$ and $S$ is large enough. Thus Assumption 3 is automatically verified if the
production function of market research rises faster than $\chi = F^{1/2}$.

IV.D.     Discussion  and  Questions  for  Future  Research 

Is  it  reasonable  to  believe  that  there  are  institutions  large  enough  to  cause  the  power  law  distribution 
of  returns?  In  view  of  the  empirical  facts,  we  believe  so.  The  large  volumes  in  Figure  V,  which 
can  be  1,000  times  bigger  than  the  median  trades,  must  come  from  very  large  traders.  They  are 
also associated with extreme price movements (Figure VI). However, a natural analysis would be
to  investigate  directly  whether  extreme  movements  without  news  [Cutler,  Poterba,  and  Summers 
1989]  are  caused  by  a  small  number  of  large  institutional  investors.  The  growing  availability  of 
databases  that  track  individual  trades  may  allow  such  a  study  to  be  conducted  in  the  near  future. 
Note  that  the  existence  of  prime  movers  does  not  preclude  that,  subsequently,  many  traders  will 
move  in  the  same  way.  Quantifying  the  importance  of  idiosyncratic  movements  of  large  trades 
versus  correlated  movements  of  beliefs  of  most  traders  would  be  interesting.43 

One prominent example of a large fund disrupting the market is Long Term Capital
Management. Its collapse created a volatility spike that did not subside for several months. Our
contribution  is  a  model  of  the  initial  impulse  -  the  form  and  the  power  law  distribution  of  the 
initial  disruption  by  a  large  trade.  We  leave  to  future  research  the  important  task  of  modeling  the 
specifics  of  the  cascade  that  followed  the  initial  impulse.44  We  speculate  that  the  empirical  facts 
we  present,  and  our  baseline  model  of  initial  impulses,  will  be  useful  for  this  future  research. 

A  second  example  is  the  Brady  [1988]  report  on  the  1987  crash.  On  the  crash  day  of  Monday, 
October 19, 1987, "this trading activity was concentrated in the hands of surprisingly few
institutions. ... Sell programs by three portfolio insurers accounted for just under $2 billion in the stock
market. ... Block sales by a few mutual funds accounted for about $900 million of stock sales," on


43 Gabaix [2005] finds that the idiosyncratic movements of large firms explain a substantial fraction of
macroeconomic activity, and Canals, Gabaix, Vilarrubia, and Weinstein [2005] find that idiosyncratic shocks explain a large
fraction of international trade.

44 Abreu and Brunnermeier [2003], Bernardo and Welch [2004], Gennotte and Leland [1990], Romer [1993], and Greenwald
and Stein [1991] also present elements for a theory of crashes.
a total of $21 billion traded (p. v) and "One portfolio insurer alone sold $1.3 billion" (p. III-22).
In the first half hour of trading, "roughly 25 percent of the volume ... came from one mutual fund
group" (p. 30). The report concludes that "much of the selling pressure was concentrated in the
hands of surprisingly few institutions. A handful of large investors provided the impetus for the
sharpness of the decline" (p. 41). Of course, some of the investors in the Brady report are program
traders, which amplify existing movements, rather than cause them. Also, our model is still too
limited to allow the rich dynamic analysis suggested by the Brady report. Nonetheless, the evidence
from the report is strongly suggestive of the hypothesis that a few traders move a relatively illiquid
market.

Our  theory  suggests  a  number  of  research  angles.  First,  it  would  be  desirable  to  study  fully 
dynamic  extensions  of  the  model.  The  analysis  becomes  much  more  difficult  [see  e.g.,  Vayanos  2001, 
Gabaix,  Gopikrishnan,  Plerou,  and  Stanley  2003],  but  the  simplicity  of  the  empirical  distributions 
suggests  that  a  simple  dynamic  theory  of  large  events  may  be  within  reach.45 

Second,  it  would  be  interesting  to  study  the  distribution  of  fund  "effective"  size  (assets  multiplied 
by  leverage)  across  classes  of  stocks.  Proposition  5  predicts  that  the  more  fat-tailed  the  size 
distribution  of  traders,  the  more  fat-tailed  the  distributions  of  volume  and  returns.  Investigating 
this prediction directly might explain a cross sectional dispersion of power law exponents.

Third,  our  model  predicts  that  the  total  price  impact  cost  paid  by  a  fund  of  size  S  will  be 
proportional  to  S,  and  that  the  total  volume  traded  with  sizable  price  impact  will  be  proportional 
to $S^{1/(1+\gamma)}$. Testing this proposition directly would be useful.

Fourth, the model suggests a particularly useful functional form for "illiquidity", which
corresponds more closely to the prefactor of Proposition 3:46


(31)  H 


\n\ 


(32)  $H' = \frac{\mathrm{cov}\left( |r_t|, V_t^{\gamma} \right)}{\mathrm{var}\left( V_t^{\gamma} \right)}$


Again, the evidence, and some models, suggests $\gamma = 1/2$, but other values may prove better suited.
Expressions  31  and  32  are  likely  to  be  more  stable  than  other  measures.  Indeed,  volume  is  a  fat- 
tailed  variable  (it  has  infinite  variance),  so  using  a  square  root  of  volume  is  likely  to  yield  a  more 


45Engle  and  Russell  [1998]  and  Liesenfeld  [2001]  present  interesting  empirical  investigations  of  the  dynamic  relations 
between  trading  and  returns. 

46 One could even calculate the two expressions only for volumes above a certain threshold, e.g., the mean volume.
stable measure than volume itself. Furthermore, the model also suggests that $H$ and $H'$ will be
proportional to $\sigma/M^{1/2}$, where $\sigma$ is the volatility of the stock and $M$ is its market capitalization.

Fifth, our approach suggests a way to estimate the power law exponent of price impact, $\gamma$, and
the power law exponent of the distribution of financial institutions, $\zeta_S$, for instance across markets.
One first estimates separately the power law exponents of volumes and returns, $\zeta_q$ and $\zeta_r$. Then
one defines the estimators $\hat{\gamma}$ and $\hat{\zeta}_S$ by:

(33)  $\hat{\gamma} = \frac{\hat{\zeta}_q}{\hat{\zeta}_r}$

(34)  $\frac{1}{\hat{\zeta}_S} = \frac{1}{\hat{\zeta}_q} + \frac{1}{\hat{\zeta}_r}$

Proposition  5  indicates  that  these  are  consistent  estimates  of  7  and  (s  m  the  polar  case  where 

Cm>Cs(1+7)/7-47 
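
As a numerical illustration of these two estimators, the following sketch assumes the polar case of
Proposition 5, in which \(\zeta_r = (1 + 1/\gamma)\zeta_S\) and \(\zeta_q = (1+\gamma)\zeta_S\), so that
\(\gamma = \zeta_q/\zeta_r\) and \(1/\zeta_S = 1/\zeta_q + 1/\zeta_r\). The function name and the input
values (the exponents 3/2 and 3 discussed in the text) are illustrative choices, not part of the
original analysis.

```python
def impact_and_size_exponents(zeta_q, zeta_r):
    """Return (gamma_hat, zeta_S_hat) from the tail exponents of volume and returns.

    Assumes the polar case of Proposition 5, where
    zeta_r = (1 + 1/gamma) * zeta_S and zeta_q = (1 + gamma) * zeta_S.
    """
    if zeta_q <= 0 or zeta_r <= 0:
        raise ValueError("tail exponents must be positive")
    gamma_hat = zeta_q / zeta_r                        # ratio of the two exponents
    zeta_s_hat = 1.0 / (1.0 / zeta_q + 1.0 / zeta_r)   # reciprocals add up
    return gamma_hat, zeta_s_hat


# Hypothetical inputs: volume exponent 3/2 and return exponent 3.
gamma_hat, zeta_s_hat = impact_and_size_exponents(zeta_q=1.5, zeta_r=3.0)
print(gamma_hat, zeta_s_hat)
```

With these inputs the estimators return a square root price impact and a size exponent of one, up to
floating-point rounding.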

Finally, the theory makes predictions about the comovements in returns, volume, and signed
volume (the sum of volumes traded on a price increase minus the volume traded on a price decrease).
Its variant in Gabaix, Gopikrishnan, Plerou, and Stanley [2003] adds predictions in the number of
trades and signed number of trades. The results show non-linear patterns and, as reported
in Figure 3 of Gabaix, Gopikrishnan, Plerou, and Stanley [2003], a quite encouraging fit
between theory and data.

V.      Conclusion 

This  paper  proposes  a  theory  in  which  large  investors  generate  significant  spikes  in  returns  and 
volume.  We  posit  that  the  specific  structure  of  large  movements  is  due  to  the  desire  to  trade  of 
sizable  institutional  investors,  stimulated  by  news.  The  distribution  of  fund  sizes,  coupled  with 
large  traders'  moderation  of  their  trading  volumes  and  a  concave  price  impact  function,  generates 
the  Pareto  exponents  3  and  3/2  for  the  distribution  of  returns  and  volumes. 

We  introduce  some  new  questions  that  finance  theories  should  answer.  Matching,  as  we  do, 
the  quantitative  empirical  regularities  established  here  (in  particular  explaining  the  exponents  of 
3  and  3/2  from  first  principles  rather  than  by  assumption)  should  be  a  sine  qua  non  criterion  for 
the  admissibility  of  a  model  of  volume  and  volatility.  We  hope  that  the  regularities  we  establish 
will constrain and guide future theories. Given its simple structure, the present model might be a
useful point of departure for thinking about these issues.

47 It is tempting to call Equation 34 a "reciprocity law" that holds irrespective of \(\gamma\).

Appendix  1:  Some  Power  Law  Mathematics 

A.  Definitions 

We  present  here  some  basic  facts  about  power  law  mathematics,  and  show  how  their  aggregation 
properties  make  them  especially  interesting  for  both  theoretical  and  empirical  work.  They  also  show 
how  our  predictions  are  robust  to  other  sources  of  noise. 

A random variable \(X\) has power law behavior if there is a \(\zeta_X > 0\) such that the probability
density \(p(x)\) follows:

(35)  \( p(x) \sim \frac{C}{x^{\zeta_X + 1}} \)

for \(x \to \infty\), and a constant \(C\). This implies [e.g. Resnick, 1987, p. 17] that the "counter-cumulative"
distribution function follows:

(36)  \( P(X > x) \sim \frac{C}{\zeta_X x^{\zeta_X}} \).

A more general definition is that there is a "slowly varying"48 function \(L(x)\) and a \(\zeta_X\) s.t.
\(p(x) \sim L(x)/x^{\zeta_X + 1}\), so that the tail follows a power law up to slowly varying corrections.

\(\zeta_X\) is the (cumulative) power law exponent of \(X\). A lower exponent means fatter tails: \(\zeta_X <
\zeta_Y\) implies that \(X\) has fatter tails than \(Y\), hence the large \(X\)'s are (infinitely, at the limit) more
frequent than large \(Y\)'s.

If \(a\) is a constant, \(E[|X|^a] = \infty\) for \(a > \zeta_X\), and \(E[|X|^a] < \infty\) for \(0 < a < \zeta_X\). For instance, if
returns have power law exponents \(\zeta_r = 3\), their kurtosis is infinite, and their skewness borderline
infinite.49 If all moments are finite (e.g., for a Gaussian distribution), the formal power law exponent
is \(\zeta_X = \infty\).


48 \(L(x)\) is said to be slowly varying [e.g., Embrechts, Kluppelberg, and Mikosch 1997, p. 564] if for all \(t >
0\), \(\lim_{x \to \infty} L(tx)/L(x) = 1\). Prototypical examples are \(L(x) = a\) and \(L(x) = a \ln x\) for a non-zero constant \(a\).

49 This makes the use of the kurtosis invalid. As the theoretical kurtosis is infinite, empirical measures of it are
essentially meaningless. As a symptom, according to Levy's theorem, the median sample kurtosis of \(T\) i.i.d. demeaned
variables \(r_1, \ldots, r_T\), with \(\kappa_T = \left(\sum_{t=1}^{T} r_t^4/T\right) / \left(\sum_{t=1}^{T} r_t^2/T\right)^2\), increases to \(+\infty\) like \(T^{1/3}\) if \(\zeta_r = 3\). Kurtosis
should be banished from use with fat-tailed distributions. As a simple diagnostic for having "fatter tail than from
normality", we would recommend, rather than the kurtosis, quantile measures such as \(P\left(|(r - \bar{r})/\sigma_r| > 1.96\right)/.05 - 1\),
which is positive if tails are fatter than predicted by a Gaussian.

B.  Transformation  Rules

Power  laws  have  excellent  aggregation  properties.  The  property  of  being  distributed  according 
to  a  power  law  is  conserved  under  addition,  multiplication,  polynomial  transformation,  min,  and 
max.  The  general  rule  is  that,  when  we  combine  two  power  law  variables,  "the  fattest  (i.e.,  the 
one with the smallest exponent) power law dominates." Indeed, for \(X_1, \ldots, X_n\) independent random
variables, and \(a\) a positive constant, we have the following formulas:

(37)  \( \zeta_{X_1 + \cdots + X_n} = \min(\zeta_{X_1}, \ldots, \zeta_{X_n}) \)

(38)  \( \zeta_{X_1 \cdot \ldots \cdot X_n} = \min(\zeta_{X_1}, \ldots, \zeta_{X_n}) \)

(39)  \( \zeta_{\max(X_1, \ldots, X_n)} = \min(\zeta_{X_1}, \ldots, \zeta_{X_n}) \)

(40)  \( \zeta_{\min(X_1, \ldots, X_n)} = \zeta_{X_1} + \cdots + \zeta_{X_n} \)

(41)  \( \zeta_{aX} = \zeta_X \)

(42)  \( \zeta_{X^a} = \frac{\zeta_X}{a} \).

For instance, if \(X\) is a power law variable for \(\zeta_X < \infty\), and \(Y\) is a power law variable with an
exponent \(\zeta_Y \geq \zeta_X\), or even a normal, lognormal or exponential variable (so that \(\zeta_Y = \infty\)), then
\(X + Y\), \(X \cdot Y\), \(\max(X, Y)\) are still power laws with the same exponent \(\zeta_X\). Hence multiplying by
normal variables, adding non-fat tail noise, or summing over i.i.d. variables preserves the exponent.
This makes theorizing with power laws very streamlined. Also, this gives the empiricist hope that
those power laws can be measured, even if the data are noisy: although noise will affect statistics
such as variances, it will not affect the power law exponent. Power law exponents carry over the
"essence" of the phenomenon: smaller order effects do not affect the power law exponent.

For example, our theory gives a mechanism by which \(\zeta_r = 3\). In reality, we observe: \(\tilde{r} = ar + b\),
where \(a\) and \(b\) are other random factors not modeled in the theory. We will still have \(\zeta_{\tilde{r}} = \zeta_r = 3\) if \(a\)
and \(b\) have thinner tails than \(r\) (\(\zeta_a, \zeta_b > 3\)). If the theory of \(r\) captures the first order effects (those
with dominating power law), its predictions for the power law exponents of the noisy empirical
counterpart \(\tilde{r}\) will hold.
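
The robustness claim can be checked by simulation. The sketch below is only illustrative: it draws
\(r\) from an exact Pareto law with exponent 3, uses a lognormal \(a\) and a Gaussian \(b\) as hypothetical
thin-tailed noise (these distributions, the sample size and the 0.5 percent tail cutoff are all
assumptions), and compares Hill estimates of the tail exponent for \(r\) and for \(ar + b\).

```python
import math
import random


def hill_exponent(sample, tail_fraction=0.005):
    """Hill estimate of the tail exponent, using the largest observations only."""
    ordered = sorted(sample, reverse=True)
    n = max(int(len(ordered) * tail_fraction), 2)
    tail = ordered[:n]
    log_cutoff = math.log(tail[-1])          # smallest value kept in the tail
    return (n - 1) / sum(math.log(v) - log_cutoff for v in tail[:-1])


rng = random.Random(0)
N = 200_000
r = [rng.paretovariate(3.0) for _ in range(N)]            # P(r > x) = x^-3 for x >= 1
a = [math.exp(rng.gauss(0.0, 0.3)) for _ in range(N)]     # lognormal factor, thin tails
b = [rng.gauss(0.0, 0.5) for _ in range(N)]               # additive Gaussian noise
noisy = [abs(ai * ri + bi) for ai, ri, bi in zip(a, r, b)]

zeta_clean = hill_exponent(r)
zeta_noisy = hill_exponent(noisy)
print(round(zeta_clean, 2), round(zeta_noisy, 2))
```

Both estimates should lie close to 3: the noise changes the scale of the tail but not its exponent.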

Proof.  See  Breiman  [1965]  and  Gnedenko  and  Kolmogorov  [1968]  for  rigorous  proofs,  and 
Sornette [2000] for heuristic derivations. Here we just indicate the proofs for the simplest cases.
By induction it is enough to prove the properties for \(n = 2\) variables.

\( P(\max(X, Y) > x) = 1 - P(\max(X, Y) \leq x) = 1 - P(X \leq x \text{ and } Y \leq x) \)
\( = 1 - P(X \leq x) P(Y \leq x) \sim 1 - \left(1 - \frac{C}{x^{\zeta_X}}\right)\left(1 - \frac{C'}{x^{\zeta_Y}}\right) \sim \frac{C''}{x^{\min(\zeta_X, \zeta_Y)}} \),

where \(C'' = C\) if \(\zeta_X < \zeta_Y\), \(C'' = C'\) if \(\zeta_X > \zeta_Y\), and \(C'' = C + C'\) if \(\zeta_X = \zeta_Y\).

\( P(\min(X, Y) > x) = P(X > x \text{ and } Y > x) = P(X > x) P(Y > x) \sim \frac{C C'}{x^{\zeta_X + \zeta_Y}} \).

Finally, if \(P(X > x) \sim C x^{-\zeta_X}\), then

\( P(X^a > x) = P\left(X > x^{1/a}\right) \sim C \left(x^{1/a}\right)^{-\zeta_X} = C x^{-\zeta_X / a} \).


C.  Estimating  Power  Law  Exponents 

There  are  two  basic  methodologies  for  estimating  power  law  exponents.  We  illustrate  them 
with  the  example  of  absolute  returns.  In  both  methods,  one  first  selects  a  cutoff  of  returns,  and 
orders the observations above this cutoff as \(r_{(1)} \geq \cdots \geq r_{(n)}\). There is yet no consensus on how
to  pick  the  optimal  cutoff,  as  systematic  procedures  require  the  econometrician  to  estimate  further 
parameters  [Embrechts,  Kluppelberg,  and  Mikosch  1997].  Often,  the  most  reliable  procedure  is  to 
use  a  simple  rule,  such  as  choosing  all  the  observations  in  the  top  5  percent. 

The first method is a "log rank log size regression", where \(\zeta\) is estimated as the OLS
coefficient on \(\ln r_{(i)}\) in the regression of the log of the rank \(i\) on the log size:

(43)  \( \ln i = A - \hat{\zeta}_{OLS} \ln r_{(i)} + \text{noise} \)

with standard error \(\hat{\zeta}_{OLS} \cdot (n/2)^{-1/2}\) [Gabaix and Ioannides 2004]. This method is the simplest,
and yields a visual goodness of fit for the power law. This is the approach used, for instance, in
Figure I. The second method is Hill's estimator

(44)  \( \hat{\zeta}_{Hill} = (n - 1) \Big/ \sum_{i=1}^{n-1} \left( \ln r_{(i)} - \ln r_{(n)} \right) \)

which has a standard error \(\hat{\zeta}_{Hill}\, n^{-1/2}\).
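
The two estimators can be sketched in a few lines of code. This is a minimal illustration rather
than the procedure used for the figures: the function names, the simulated Pareto sample with
exponent 3, and the top 5 percent cutoff rule are assumptions made for the example.

```python
import math
import random


def tail_sample(abs_returns, top_fraction=0.05):
    """Keep the largest observations, ordered as r_(1) >= ... >= r_(n)."""
    ordered = sorted(abs_returns, reverse=True)
    n = max(int(len(ordered) * top_fraction), 3)
    return ordered[:n]


def ols_rank_size(tail):
    """Regress ln(rank) on ln(size); returns (zeta_hat, standard_error)."""
    n = len(tail)
    xs = [math.log(v) for v in tail]
    ys = [math.log(i) for i in range(1, n + 1)]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    zeta_hat = -slope                       # the regression slope is -zeta
    return zeta_hat, zeta_hat * (n / 2) ** -0.5


def hill(tail):
    """Hill's estimator; returns (zeta_hat, standard_error)."""
    n = len(tail)
    log_min = math.log(tail[-1])            # ln r_(n), the smallest value in the tail
    zeta_hat = (n - 1) / sum(math.log(v) - log_min for v in tail[:-1])
    return zeta_hat, zeta_hat * n ** -0.5


rng = random.Random(1)
sample = [rng.paretovariate(3.0) for _ in range(100_000)]   # simulated |returns|
tail = tail_sample(sample)
zeta_ols, se_ols = ols_rank_size(tail)
zeta_hill, se_hill = hill(tail)
print(round(zeta_ols, 2), round(se_ols, 3), round(zeta_hill, 2), round(se_hill, 3))
```

On an i.i.d. sample such as this one, both estimates should be close to the true exponent of 3. The
standard errors reported by the code rely on independence and understate the uncertainty for
autocorrelated data.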
Both methods have pitfalls, discussed in Embrechts, Kluppelberg, and Mikosch [1997, pp. 330-345]
and Gabaix and Ioannides [2004]. One large pitfall is the assumption of independent observations.
In reality, trading activity is autocorrelated, which causes standard errors to be underestimated;
however, point estimates remain unbiased. In a future paper we plan to propose a method
of estimating the standard errors. In any case, the stability of the estimates across different periods,
countries and classes of assets gives us confidence that the empirical estimates we report here are
robust.

With  the  samples  of  millions  of  points  available  in  finance,  standard  errors  are  so  small  that 
one  can  reject  essentially  any  null  hypothesis.  Hence,  researchers  estimating  power  laws  typically 
do  not  use  tests  to  see  if  a  distribution  with  more  parameters  would  offer  a  better  fit.  With  so 
many  data  points,  statistical  tests  would  always  justify  a  higher-dimensional  parameterization,  even 
though economically, the improvement in fit would be minimal. Rather, \(\zeta\) is best interpreted as the
optimal  one-parameter  approximation  of  the  tail  by  a  Pareto  family.  Explaining  the  value  of  this 
one-parameter  approximation  is  already  a  difficult  challenge.  Explaining  the  higher  order  terms 
may  be  best  left  for  future  decades  of  research. 

Appendix  2:  Proofs 


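Proof of Proposition 2. We use \(T = V/\bar{V}\) and \(B(1) = 0\) to calculate:

\( \mathrm{var}\left( \int_1^{1+T} B(t)\, dt \right) = \mathrm{var}\left( \int_0^T \left( \int_0^s dB(1+u) \right) ds \right) = \mathrm{var}\left( \int_0^T \left( \int_u^T ds \right) dB(u+1) \right) \)
\( = \mathrm{var}\left( \int_0^T (T - u)\, dB(u+1) \right) = \int_0^T (T - u)^2\, du = \frac{T^3}{3} = \frac{V^3}{3\bar{V}^3} \).

The liquidity provider sells \(V\) shares to the fund at a price \(p + \pi + \tau\), and replenishes her
inventory during \([1, 1+T]\) at a total cost \(K = \int_1^{1+T} p(t) \bar{V}\, dt\). Her net income from the transaction
is:

\( W = (p + \pi + \tau) V - \int_1^{1+T} p(t) \bar{V}\, dt = (p + \pi + \tau) V - \int_1^{1+T} (p + \pi + \sigma B(t)) \bar{V}\, dt \)
\( = \tau V - \sigma \bar{V} \int_1^{1+T} B(t)\, dt \).

Her utility is:

\( U = E[W] - \lambda (\mathrm{var}\, W)^{\delta/2} = \tau V - \lambda \left( \sigma^2 \bar{V}^2\, \mathrm{var} \int_1^{1+T} B(t)\, dt \right)^{\delta/2} = \tau V - \lambda \left( \frac{\sigma^2 V^3}{3\bar{V}} \right)^{\delta/2} \).

The fund has full bargaining power, and so leaves the liquidity supplier with a reservation utility
\(U = 0\). This implies:

\( \tau = \lambda \left( \frac{\sigma^2}{3\bar{V}} \right)^{\delta/2} V^{3\delta/2 - 1} \).

Economically, the liquidity provider purchases the stock back at an average price \(\bar{p} = T^{-1} \int_1^{1+T} p(t)\, dt\),
which has expected value \(p + \pi\) and standard deviation \(\sigma \left( \frac{V}{3\bar{V}} \right)^{1/2}\). The temporary impact \(\tau\) is the
compensation for this price risk of \(\sigma \left( \frac{V}{3\bar{V}} \right)^{1/2}\).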
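
The key step of this proof, that the integral of a standard Brownian motion over an interval of length
\(T\) has variance \(T^3/3\), can be checked numerically. The sketch below is an illustration under
stated assumptions: the horizon \(T = 2\), the grid of 200 steps, the number of simulated paths and
the trapezoid rule are arbitrary choices, not part of the original proof.

```python
import math
import random


def integrated_brownian_variance(T, n_steps=200, n_paths=4000, seed=0):
    """Monte Carlo estimate of var( integral_0^T B(t) dt ) with B(0) = 0."""
    rng = random.Random(seed)
    dt = T / n_steps
    step_sd = math.sqrt(dt)                  # standard deviation of each increment
    integrals = []
    for _ in range(n_paths):
        level, area = 0.0, 0.0
        for _ in range(n_steps):
            nxt = level + rng.gauss(0.0, step_sd)
            area += 0.5 * (level + nxt) * dt  # trapezoid rule on the path
            level = nxt
        integrals.append(area)
    mean = sum(integrals) / n_paths
    return sum((v - mean) ** 2 for v in integrals) / (n_paths - 1)


T = 2.0
simulated = integrated_brownian_variance(T)
theoretical = T ** 3 / 3
print(round(simulated, 3), round(theoretical, 3))
```

The simulated variance should agree with \(T^3/3\) up to Monte Carlo error of a few percent.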

Proof of Proposition 4. In this proof we use the notation \(V(M)\) rather than \(V(M, S)\). The
Lagrangian is:

\( \mathcal{L} = \int V(M) \left( M - R(V(M)) \right) f(M)\, dM - \mu \int V(M) R(V(M)) f(M)\, dM \)
\( = \int V(M) \left( M - (1 + \mu) h V(M)^{\gamma} \right) f(M)\, dM \).

It is sufficient to optimize on \(V(M)\) separately for each \(M\):

\( 0 = \frac{\partial \mathcal{L}}{\partial V(M)} = \frac{\partial}{\partial V(M)} \left[ V(M) M - (1 + \mu) h V(M)^{1+\gamma} \right] f(M) \)

\( 0 = M - (1 + \mu)(1 + \gamma) h V(M)^{\gamma} \)

(45)  \( V(M) = \left[ (1 + \mu)(1 + \gamma) h \right]^{-1/\gamma} M^{1/\gamma} \).

Thus, using Equation 17,

\( -E[r_t \mid C = 0] = E\left[ h V(M)^{1+\gamma} / S \right] = h E\left[ M^{1 + 1/\gamma} \right] \left[ (1 + \mu)(1 + \gamma) h \right]^{-(1 + 1/\gamma)} / S \).

Constraint (18) binds iff \(\mu > 0\), i.e.

(46)  \( S < S^* \)

with:

(47)  \( S^* = \frac{h^{-1/\gamma}}{\Lambda} E\left[ M^{1 + 1/\gamma} \right] (1 + \gamma)^{-(1 + 1/\gamma)} \).

If the constraint binds, \(-E[r_t \mid C = 0] = \Lambda\). This implies:

\( \left[ (1 + \mu)(1 + \gamma) h \right]^{-(1 + 1/\gamma)} = \frac{\Lambda S}{h E\left[ M^{1 + 1/\gamma} \right]} \)

and going back to Equation 45, we get \(V(M) = v M^{1/\gamma} S^{1/(1+\gamma)}\) with:

(48)  \( v = \left( \frac{\Lambda}{h E\left[ M^{1 + 1/\gamma} \right]} \right)^{1/(1+\gamma)} \).

The expression for \(R\) comes from \(R = h V^{\gamma}\).

To calibrate \(S^*\), we use the following parameters, which we view as simply indicative: \(\gamma = 1/2\),
\(E[M^3]^{1/3} = 10\) percent (which is less than the annual standard deviation of the market, hence likely
to be conservative), \(\Lambda = 2\) percent of price impact costs paid annually.50 We take a price impact,
motivated by Sections II.C and III.A: \(R(V) = \lambda \sigma (V/D)^{1/2}\), where \(\sigma\) = daily market volatility
= 0.01, \(\lambda = 1/2\), which means that up to \(\lambda^2 = 25\) percent of the market fluctuations are due to our
effects, and \(D\) = daily market turnover. Using the 1999 number of a total equity market capitalization
of $18 trillion, and a 50 percent annual turnover, \(D\) = 1/2 x $18 trillion/250 = $36 billion. So

\( S^* = \frac{D\, E[M^3]}{\lambda^2 (3/2)^3 \sigma^2 \Lambda} \) = $21 trillion.
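
The arithmetic behind this figure can be reproduced directly. The sketch below takes the threshold to
be \(S^* = h^{-1/\gamma} E[M^{1+1/\gamma}] (1+\gamma)^{-(1+1/\gamma)} / \Lambda\) with \(h = \lambda \sigma D^{-\gamma}\), which is our
reading of Equation 47 combined with the assumed price impact function; the variable names are
illustrative.

```python
# Indicative parameters quoted in the calibration paragraph.
gamma = 0.5                 # exponent of the price impact function
third_moment = 0.10 ** 3    # E[M^3], from E[M^3]^(1/3) = 10 percent
Lambda = 0.02               # price impact costs paid annually (2 percent)
lam = 0.5                   # coefficient of the price impact function
sigma = 0.01                # daily market volatility
market_cap = 18e12          # total equity market capitalization, in dollars
annual_turnover = 0.5       # 50 percent annual turnover
trading_days = 250

# Daily market turnover D, in dollars.
D = annual_turnover * market_cap / trading_days

# Price impact R(V) = h V^gamma with h = lam * sigma / D^gamma.
h = lam * sigma / D ** gamma

# Threshold size S* (assumed reading of Equation 47).
S_star = (h ** (-1.0 / gamma) * third_moment
          * (1.0 + gamma) ** (-(1.0 + 1.0 / gamma)) / Lambda)

print(f"D  = ${D / 1e9:.0f} billion")
print(f"S* = ${S_star / 1e12:.1f} trillion")
```

The result is roughly $21 trillion, in line with the figure quoted in the text.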

50 If the fund gets \(F\) signals per year, \(S^*\) is divided by \(F^{1/2}\), as \(M\) is divided by \(F^{1/2}\) and \(\Lambda\) by \(F\).

Proof of Proposition 5. We start from Equation 21. We apply the rules in Appendix 1 to derive:

\( \zeta_R = \zeta_{h v^{\gamma} M S^{\gamma/(1+\gamma)}} = \zeta_{M S^{\gamma/(1+\gamma)}} \)  by applying (41)
\( = \min\left[ \zeta_M, \zeta_{S^{\gamma/(1+\gamma)}} \right] \)  by applying (38)
\( = \min\left[ \zeta_M, \left( 1 + \frac{1}{\gamma} \right) \zeta_S \right] \)  by applying (42)

which proves the Proposition. One derives \(\zeta_V\) in the same way.

Proof of Proposition 6. Assumption 2 implies \(\zeta_M > 1 + 1/\gamma = 3\). Then, Proposition 5 gives
\(\zeta_R = \min(3, \zeta_M) = 3\) and \(\zeta_V = \gamma \zeta_R = 3/2\).
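
The exponent arithmetic of Propositions 5 and 6 is easy to tabulate. The sketch below is a minimal
illustration: the function name is hypothetical, and the formula used for \(\zeta_V\) is our own application
of rules (38), (41) and (42) to \(V = v M^{1/\gamma} S^{1/(1+\gamma)}\), following the remark that \(\zeta_V\) is derived in the
same way as \(\zeta_R\).

```python
import math


def predicted_exponents(zeta_M, zeta_S, gamma):
    """Tail exponents of returns and volume implied by the transformation rules."""
    # R = h v^gamma M S^(gamma/(1+gamma)): the fatter of the two tails dominates.
    zeta_R = min(zeta_M, (1.0 + 1.0 / gamma) * zeta_S)
    # V = v M^(1/gamma) S^(1/(1+gamma)): same rules applied to volume.
    zeta_V = min(gamma * zeta_M, (1.0 + gamma) * zeta_S)
    return zeta_R, zeta_V


# Square root price impact, size exponent of one, thin-tailed signal.
zeta_R, zeta_V = predicted_exponents(zeta_M=math.inf, zeta_S=1.0, gamma=0.5)
print(zeta_R, zeta_V)
```

With \(\gamma = 1/2\) and \(\zeta_S = 1\) this gives the exponents 3 for returns and 3/2 for volumes whenever the
signal has a thin enough tail.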

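Proof of Proposition 7. We will start with the following Lemma, which means that if \(X\) has
fatter tails than \(Y\), then \(E[X \mid XY = z]\) is proportional to \(z\) for large \(z\). The reason is that an
extreme value of \(XY\) probably comes from an extreme value of \(X\).

Lemma 8 Suppose that \(X\) and \(Y\) are independent random variables, with exact power distributions:
\(P(X > x) = (x/x^*)^{-\zeta_X}\), \(P(Y > y) = (y/y^*)^{-\zeta_Y}\) for \(x > x^*\) and \(y > y^*\). Define \(z^* = x^* y^*\).
Assume \(\zeta_X \leq \zeta_Y\). Then:

(49)  \( E[X \mid XY = z] = L(z) z \)

(50)  \( L(z) = \frac{E\left[ Y^{\zeta_X - 1} 1_{Y < z/x^*} \right]}{E\left[ Y^{\zeta_X} 1_{Y < z/x^*} \right]} \)

\(L(z)\) is a slowly varying function. If \(\zeta_X < \zeta_Y\),

(51)  \( \lim_{z \to \infty} L(z) = \frac{E\left[ Y^{\zeta_X - 1} \right]}{E\left[ Y^{\zeta_X} \right]} \).

If \(\zeta_X = \zeta_Y\),

(52)  \( L(z) = \frac{1 - z^*/z}{y^* \ln(z/z^*)} \sim \frac{1}{y^* \ln z} \) for \(z \to \infty\).

Proof of Lemma 8. By normalization, it is enough to study the case \(x^* = y^* = 1\). Call \(f\) and
\(g\) the densities of \(X\) and \(Y\). By Bayes' rule, \(p(X = x \mid XY = z) = k f(x) g(z/x)/x\) for a constant
\(k\). So:

\( E[X \mid XY = z] = \frac{\int x f(x) g(z/x)/x \, dx}{\int f(x) g(z/x)/x \, dx} = z \frac{\int f(z/y) g(y)/y^2 \, dy}{\int f(z/y) g(y)/y \, dy} \)  by the change of variable \(x = z/y\)

\( = z \frac{\int_1^z (z/y)^{-\zeta_X - 1} g(y)/y^2 \, dy}{\int_1^z (z/y)^{-\zeta_X - 1} g(y)/y \, dy} = z \frac{\int_1^z y^{\zeta_X - 1} g(y) \, dy}{\int_1^z y^{\zeta_X} g(y) \, dy} = z \frac{E\left[ Y^{\zeta_X - 1} 1_{Y < z} \right]}{E\left[ Y^{\zeta_X} 1_{Y < z} \right]} = z L(z) \).

When \(\zeta_X < \zeta_Y\), \(E\left[ Y^{\zeta_X} 1_{Y < z} \right] \to E\left[ Y^{\zeta_X} \right] < \infty\).
When \(\zeta_X = \zeta_Y\),

\( L(z) = \frac{\int_1^z y^{\zeta_X - 1} g(y) \, dy}{\int_1^z y^{\zeta_X} g(y) \, dy} = \frac{\int_1^z y^{\zeta_X - 1} y^{-\zeta_Y - 1} \, dy}{\int_1^z y^{\zeta_X} y^{-\zeta_Y - 1} \, dy} = \frac{\int_1^z y^{-2} \, dy}{\int_1^z y^{-1} \, dy} = \frac{1 - z^{-1}}{\ln z} \).

□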
For the proof of Proposition 7, we use Lemma 8 with \(X = M\) and \(Y = h v^{\gamma} S^{\gamma/(1+\gamma)}\). We call
\(\zeta_M^{subj}\) the exponent of the distribution agents use when they calculate the conditional expectation
(28). Given the hypothesis \(\zeta_M^{subj} \leq \zeta_R\), Equation 23 gives \(\zeta_M^{subj} \leq \zeta_R \leq \left( 1 + \frac{1}{\gamma} \right) \zeta_S = \zeta_Y\). So using
Lemma 8,

\( E\left[ M \mid R = h v^{\gamma} M S^{\gamma/(1+\gamma)} = XY \right] = R L(R) \)

for a slowly varying function \(L(R)\) of \(R\). In the case \(\zeta_M^{subj} < \left( 1 + \frac{1}{\gamma} \right) \zeta_S\), we get:

\( \lim_{R \to \infty} L(R) = b' = \frac{E\left[ Y^{\zeta_M^{subj} - 1} \right]}{E\left[ Y^{\zeta_M^{subj}} \right]} \),

a constant. Finally, given \(\pi = R L(R)\), and \(L\) is slowly varying,

Appendix 3: Confidence Intervals and Tests When a Variable Has Infinite Variance

A. Construction of the Confidence Intervals for Figure VI

In a given bin conditioned by \(Q = Q_i\), with \(k\) elements \(r_1^2, \ldots, r_k^2\), the point estimate of \(E[r^2 \mid Q = Q_i]\)
is the sample mean of the \(r_j^2\), which we call \(m\). Getting a confidence interval for \(m\) is delicate, as \(r^2\)
has infinite variance, so the standard approach relying on asymptotic normality is invalid. But the
theory of self-normalizing sums of Logan, Mallows, Rice, and Shepp [1973] shows that if \(\mu\) is the
true mean and \(\sigma\) is the empirical standard deviation of the \(r_j^2\) in the bin with \(k\) observations, then the
ratio \(t = k^{1/2} (m - \mu)/\sigma\) follows a non-degenerate distribution for large \(k\). By Monte Carlo analysis
we simulate draws following a power law with exponent 1.5, which is the exponent of \(r^2\), and we
tabulate the 2.5 percent and 97.5 percent quantiles of \(-t\), which we call \(-x^- = -1.1\) and \(x^+ = 5.5\).
They differ from their finite variance value, which would be \(x^- = x^+ = 1.96\).51

To construct 95 percent confidence intervals, we can first calculate the empirical standard error
\(\Delta r_i^2 = \sigma_{r_i^2} k^{-1/2}\), the sample standard deviation of the observations divided by the square root of
the number of observations. A 95 percent confidence interval is \([m_i - x^- \Delta r_i^2, m_i + x^+ \Delta r_i^2]\). We
should stress that we make the simplifying assumption of independent and identically distributed
draws. Given that the data are likely to be autocorrelated, our confidence intervals are likely to be
too narrow.
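
The Monte Carlo step can be sketched as follows. This is an illustration of the procedure described
above, not the original computation: the number of replications, the seed, and the function names are
assumptions, and the draws come from an exact Pareto law with exponent 1.5 and bins of 200
observations.

```python
import math
import random


def self_normalized_quantiles(k=200, zeta=1.5, n_rep=4000, seed=0):
    """Simulate t = sqrt(k) (m - mu) / sd and return (x_minus, x_plus)."""
    rng = random.Random(seed)
    mu = zeta / (zeta - 1.0)      # true mean of a Pareto law with minimum 1
    neg_t = []
    for _ in range(n_rep):
        draws = [rng.paretovariate(zeta) for _ in range(k)]
        m = sum(draws) / k
        sd = math.sqrt(sum((d - m) ** 2 for d in draws) / (k - 1))
        neg_t.append(-math.sqrt(k) * (m - mu) / sd)
    neg_t.sort()
    q_low = neg_t[int(0.025 * n_rep)]     # 2.5 percent quantile of -t, i.e. -x_minus
    q_high = neg_t[int(0.975 * n_rep)]    # 97.5 percent quantile of -t, i.e. x_plus
    return -q_low, q_high


def confidence_interval(values, x_minus, x_plus):
    """Asymmetric 95 percent interval [m - x_minus * se, m + x_plus * se]."""
    k = len(values)
    m = sum(values) / k
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (k - 1))
    se = sd / math.sqrt(k)
    return m - x_minus * se, m + x_plus * se


x_minus, x_plus = self_normalized_quantiles()
print(round(x_minus, 2), round(x_plus, 2))
```

The simulated quantiles are markedly asymmetric, so the resulting interval extends much further above
the sample mean than below it.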

51 When \(k\) is finite, there is some sensitivity of \(x^-\) and \(x^+\) to the underlying distribution. We take a pure power
law \(P(r^2 > x) = x^{-3/2}\) for \(x > 1\), and \(k = 200\), to reflect our typical sample size in bins of extreme values.

B. Test of Relation (7): \(E[r^2 \mid Q] = \alpha + \beta Q\)

For each bin \(Q_i\) of \(Q\), we set \(r_i^2 = E[r^2 \mid Q = Q_i]\), and \(\Delta r_i^2\) the sample standard error in interval
\(i\). By least squares we fit an affine relationship \(E[r^2 \mid Q] = g(Q)\), with

\( g(Q) = 0.07 + 0.60\, Q \)
          (0.59)  (0.013).

The standard errors are in parentheses and the \(R^2 = 0.90\). We find that for all values \(Q_i > 3\), the
predicted value \(g(Q_i)\) belongs to the 95 percent confidence interval: \(g(Q_i) \in [r_i^2 - x^- \Delta r_i^2, r_i^2 + x^+ \Delta r_i^2]\).
We conclude that, at the 95 percent confidence level, we cannot reject the linear form \(E[r^2 \mid Q] =
g(Q)\) for \(Q > 3\).
[Figure I plot. Horizontal axis: x (units of standard deviation).]

Figure I: Empirical cumulative distribution of the absolute values of the normalized 15-minute
returns of the 1,000 largest companies in the Trades And Quotes database for the 2-year period
1994-1995 (12 million observations). We normalize the returns of each stock so that the normalized
returns have a mean of 0 and a standard deviation of 1. For instance, for a stock \(i\), we consider the
returns \(r'_{it} = (r_{it} - \bar{r}_i)/\sigma_{r,i}\), where \(\bar{r}_i\) is the mean of the \(r_{it}\)'s and \(\sigma_{r,i}\) is their standard deviation.
In the region \(2 < x < 80\) we find an ordinary least squares fit \(\ln P(|r| > x) = -\zeta_r \ln x + b\), with
\(\zeta_r = 3.1 \pm 0.1\). This means that returns are distributed with a power law \(P(|r| > x) \sim x^{-\zeta_r}\) for
large \(x\) between 2 and 80 standard deviations of returns. Source: Gabaix et al. [2003].
Figure II: Probability density function of the normalized 5-minute returns of the 1,000
largest companies in the Trades And Quotes database for the 2-year period 1994-1995. The values
in the center of the distribution arise from the discreteness in stock prices, which are set in units of
fractions of U.S. dollars, usually 1/8, 1/16, or 1/32. The solid curve is a power-law fit in the region
\(2 < x < 80\). We find \(\zeta = 3.1 \pm 0.03\) for the positive tail, and \(\zeta = 2.84 \pm 0.12\) for the negative tail.
The dotted line represents a Gaussian density. Source: Plerou et al. [1999].
Figure III: Empirical cumulative distribution function of the absolute value of the daily return of the
Nikkei (1984-97), the Hang Seng (1980-97), and the S&P 500 (1962-96). The apparent power-law
behavior in the tails is characterized by the exponents \(\zeta_r = 3.05 \pm 0.16\) (Nikkei), \(\zeta_r = 3.03 \pm 0.16\)
(Hang Seng), and \(\zeta_r = 3.34 \pm 0.12\) (S&P 500). The fits are performed in the region \(|r|\) between 1
and 10 standard deviations of returns. Source: Gopikrishnan et al. [1999].
Figure IV: Cumulative distribution of the conditional probability \(P(|r| > x)\) of the daily returns
of companies in the CRSP database, 1962-1998. We consider the starting values of market capitalization
\(K\), define uniformly spaced bins on a logarithmic scale, and show the distribution of
returns for each bin: \(K \in [10^5, 10^6]\) (○), \(K \in [10^6, 10^7]\) (♦), \(K \in [10^7, 10^8]\) (□), \(K \in [10^8, 10^9]\) (△).
\(K\) is measured in 1962 constant dollars. (a) Unnormalized returns. Each cumulative distribution
corresponds to a bin of sizes. Small stocks are to the right, because they are more volatile. (b)
Returns normalized by the average volatility \(\sigma_K\) of each bin. The plots collapse to an identical
distribution, with \(\zeta_r = 2.70 \pm .10\) for the negative tail, and \(\zeta_r = 2.96 \pm .09\) for the positive tail. The
horizontal axis displays returns that are as high as 100 standard deviations. Source: Plerou et al.
[1999].


[Figure V plot. Axes in powers of 10. Horizontal axis: trade size q. Legend: NYSE (1000 stocks),
Paris Bourse (30 stocks), LSE (250 stocks). Reference slope: \(1 + \zeta_q = 2.5\).]

Figure V: Probability density of normalized individual transaction sizes \(q\) for three stock markets: (i)
NYSE for 1994-5, (ii) the London Stock Exchange for 2001, and (iii) the Paris Bourse for 1995-1999.
OLS fit yields \(\ln p(x) = -(1 + \zeta_q) \ln x + \text{constant}\) for \(\zeta_q = 1.5 \pm 0.1\). This means a probability density
function \(p(x) \sim x^{-(1 + \zeta_q)}\) and a countercumulative distribution function \(P(q > x) \sim x^{-\zeta_q}\). The
three stock markets appear to have a common distribution of volume, with a power law exponent
of \(1.5 \pm 0.1\). The horizontal axis shows individual volumes that are up to \(10^4\) times larger than
the absolute deviation, \(|q - \bar{q}|\).


[Figure VI plot. Horizontal axis: aggregate volume Q, with ticks from 0.001 to 100. Vertical axis
ticks: 0.1, 1, 10, 100.]


Figure VI: Conditional expectation \(E[r^2 \mid Q]\) of the squared return \(r^2\) in \(\Delta t\) = 15 minutes, given
the aggregate volume \(Q\) in \(\Delta t\). \(r\) is in units of standard deviation, and \(Q\) in units of absolute
deviation, \(|Q - \bar{Q}|\). The results are averaged over the 100 largest stocks on the New York Stock
Exchange by market capitalization on January 1, 1994. The data span the 2-year period 1994-95
and are obtained from the Trades and Quotes database, which records all transactions for all listed
securities in the NYSE, AMEX and NASDAQ. Formal tests reported in Appendix 3 show that one
cannot reject \(E[r^2 \mid Q] = \alpha + \beta Q\) for \(Q\) large enough (\(Q > 3\)). This is consistent with a square root price
impact of large trades. Appendix 3 reports the procedure used to compute the 95% confidence
intervals.

Figure VII: Cumulative distribution of the size (assets under management) of the top mutual funds
in 1999. Source: Center for Research on Security Prices.